Abstract

This paper discusses the prevalence of Silicon Valley-style localizations of individual
manufacturing industries in the United States. Several models in which firms choose
locations by throwing darts at a map are used to test whether the degree of localization is
greater than would be expected to arise randomly and to motivate a new index of
geographic concentration. The proposed index controls for differences in the size distribution
of plants and for differences in the size of the geographic areas for which data is available.
As a consequence, comparisons of the degree of geographic concentration across industries
can be made with more confidence. We reaffirm previous observations in finding that
almost all industries are localized, although the degree of localization appears to be slight in
about half of the industries in our sample. We explore the nature of agglomerative forces
in describing patterns of concentration, the geographic scope of localization, and the extent
to which agglomerations involve plants in similar as opposed to identical industries.


1     Introduction 

The  concentration  of  the  computer  industry  in  Silicon  Valley  and  of  the  auto  industry  in 
Detroit  are  two  of  the  more  famous  examples  of  the  geographic  agglomeration  of  firms  in 
a  single  industry.  The  economics  literature  motivated  by  these  examples  is  both  old  and 
vibrant.  Agglomerations  have  for  years  drawn  the  attention  both  of  urban  planners  with 
practical  concerns  and  of  economists  who  wish  to  understand  them  simply  because  they 
are  a  striking  feature  of  the  economic  landscape.  More  recently,  they  have  been  regarded 
also  as  a  potential  source  of  insights  into  the  nature  of  the  increasing  returns  and  external 
economies  which  drive  the  new  theories  of  growth  and  international  trade.  As  a  result, 
researchers  primarily  interested  in  international  trade,  growth,  industrial  organization,  and 
business  strategy  have  joined  geographers  and  urban  economists  in  investigating  geographic 
concentration.^1

This  paper  is  concerned  with  measurement  issues  relevant  to  work  in  all  these  fields. 
Our  "dartboard  approach"  to  studying  concentration  consists  essentially  of  extending  the 
analogy  of  firms  choosing  locations  by  throwing  darts  at  a  map  into  a  useful  set  of  models  of 
location  choice  in  the  presence  of  agglomerative  forces.  In  doing  so,  we  have  two  main  goals. 
First,  we  wish  to  look  formally  at  whether  most  industries  are  truly  localized.  Second,  and 
more  importantly,  we  wish  to  use  the  models  to  guide  the  development  of  new  tools  for 
the  measurement  of  localization.  We  hope  that  the  index  of  localization  we  propose  will 
facilitate  future  research  into  a  range  of  topics  involving  cross-industry  comparisons,  e.g. 
how  patterns  of  agglomeration  compare  in  different  countries,  how  levels  of  concentration 
have  evolved  over  time,  and  whether  cross-industry  patterns  provide  insights  into  the  nature 
of  the  forces  which  cause  agglomeration. 

That  Silicon  Valley-style  agglomerations  may  be  more  the  rule  than  the  exception  has 
been  noted  by  a  number  of  authors  (see  e.g.  Krugman  (1991a)).  Our  first  goal  is  to  provide 
a  careful  reexamination  of  whether  this  is  indeed  the  case.  The  defining  characteristic  of 
our  dartboard  approach  (and  the  motivation  for  a  reexamination)  is  that  we  wish  to  reserve 
the  term  "localized"  for  industries  exhibiting  levels  of  concentration  beyond  those  which 
would  be  observed  if  firms  had  chosen  the  locations  of  their  plants  in  a  completely  random 
manner. In doing so, we take as exogenous the discreteness of plants.^2 For example, in
the  U.S.  vacuum  cleaner  industry  (S.I.C.  3635)  about  75%  of  the  employees  work  in  one 
of  the  four  largest  plants.    Given  this,  we  do  not  want  to  regard  the  industry  as  being 


^1 For samples of work in these fields see Florence (1948), Hoover (1948), Fuchs (1962), Carlton (1983),
Henderson (1988), Enright (1990), Porter (1990), Krugman (1991a), and Jaffe et al. (1993).

^2 We do not mean to say that the determination of plant sizes is not a topic with interesting implications
for understanding increasing returns, just that it is usefully separated from the measurement of interplant
agglomeration.
localized simply because 75% of its employment is contained in four states. Also, even if
firms did choose locations for their plants by throwing darts at a map, one should recognize
that several of the plants by chance might appear to form a cluster.^3 Our use of the
term geographic concentration is further restricted in that we will regard an industry as
concentrated only if it displays some agglomeration beyond the overall concentration of U.S.
manufacturing.^4 For example, we do not want to call the newspaper industry concentrated
just because 12% of all employment in the industry is in California and an additional 9%
is in New York. Despite this more stringent definition of localization, our results strikingly
reaffirm the belief that localization is widespread.

Our primary focus in this paper is on the development of a new index (and other tools)
for the measurement of the degree to which industries are geographically concentrated.
We believe that a useful index of geographic concentration must have two properties: it
must measure something which is interesting to economists and allow one to make
comparisons across industries. Such comparisons are not only of descriptive interest, but are
the substance of most inquiries into the nature of geographic concentration.^5 Interindustry
(or intertemporal) comparisons are problematic with previously defined indices because
the comparisons are greatly affected (in ways which are not completely understood) by
variations in industry characteristics and data availability.^6

We motivate our index with an analysis of two models of location choice: one based
on the idea that spillovers (e.g. localized knowledge spillovers^7) may lead firms to wish to
locate together, and the other based on the idea that firms want to locate wherever some
type of natural advantage (e.g. access to raw materials) is present. Both models are capable
of accounting for geographic concentration, and they are likely important to varying degrees


^3 In fact, one needs to throw only 6 darts at a map of the U.S. before it is more likely than not that at least
two will land in the same state. Such random agglomerations would be less likely to occur if transportation costs or
other "centrifugal" forces give firms a desire to locate away from their competition.

^4 We thus use the term as a synonym for what Henderson (1988) and Krugman (1991a) call localization as
opposed to what they term urbanization or geographic concentration. We hope that our use of localization
and geographic concentration as interchangeable terms does not create confusion. Again, we do not wish to
imply that the overall agglomeration of industries is not an interesting topic, just that it is usefully separated
from an examination of intraindustry agglomeration.

^5 For example, Krugman (1991a) discusses whether high tech industries are more concentrated than other
industries  to  investigate  the  importance  of  knowledge  spillovers  and  compares  the  U.S.  auto  industry  with  its 
European  counterpart  to  discuss  the  potential  impact  of  European  integration.  Earlier  comparative  works 
include  Florence's  (1948)  study  of  U.S.  and  British  industries  and  Fuchs's  (1962)  discussion  of  changes  in 
the  U.S.  between  1929  and  1954. 

^6 Representative examples of these indices are those of Creamer (1943), Florence (1948), Enright (1990) and
Krugman (1991a). Florence's observation that industries with larger plants are more concentrated is a
particularly clear example of the difficulties in interpreting comparisons.

^7 See Krugman (1991b) for a discussion of other spillovers.


in different industries. (Our best example of natural advantage is the wine industry where
it is difficult to separate manufacturing from the growing of grapes and 78% of employment
is in California. Our best example of spillovers is the fur industry where 334 plants in
New York (most in Manhattan) employ 77% of the industry's workforce.^8) The main point
of our analysis is not that these models can both account for geographic concentration,
but rather that regardless of which mechanism generates geographic concentration in a
particular industry we can control for the number and size distribution of plants and for
the set of geographic areas for which data is available in the same way. It is because of this
coincidence that we feel somewhat comfortable that our index may control for these factors
in the real world as well.

While the paper is concerned largely with methodology, we try also to provide as detailed
a description as space allows of geographic concentration in U.S. manufacturing industries.^9
After all, the ultimate test of an index is whether it provides enlightening results. First,
we discuss overall levels of concentration, with one observation being that many industries
are only slightly concentrated (with a substantial fraction of what others have identified
as concentration being attributable to the discreteness of plants). Next, we discuss briefly
which  industries  are  concentrated.  Subsequently,  we  explore  the  nature  of  the  spillovers  (or 
other  forces)  causing  agglomeration  along  a  number  of  dimensions:  using  data  on  county, 
state,  and  regional  agglomeration  to  investigate  the  geographical  scope;  looking  at  whether 
their  influence  is  felt  within  narrowly  defined  industries  or  whether  these  spillovers  act  more 
broadly;  and  examining  the  degree  to  which  the  agglomeration  of  plants  occurs  internally 
within  firms. 

2     Models  of  Location  Choice 

In  this  section  we  develop  several  simple  models  of  location  choice.  These  models  will  be 
used  to  construct  a  test  of  whether  observed  levels  of  geographic  concentration  are  greater 
than  would  be  expected  to  occur  randomly,  and  to  motivate  our  subsequent  proposal  of  an 
index  of  geographic  concentration. 

As a practical matter, the data available for measuring geographic concentration typically
consists of a breakdown of an industry's total employment by some geographic subunits,
e.g. we may find state-by-state employments for an industry in the U.S. or country-by-


^8 Fuchs (1957) provides an excellent discussion of the industry.

^9 The raw data for most of our calculations is from the 1987 Census of Manufactures. We have gone to
some  length  to  fill  in  missing  state-industry  employment  data  so  that  we  may  analyze  the  complete  set  of 
manufacturing  industries.  We  have  also  estimated  the  Herfindahl  indices  of  the  plant  size  distributions  for 
each  4-digit  industry.  We  hope  that  this  data  may  prove  useful  in  future  work  as  well. 


country employments in the European Community. We therefore consider an abstract
model in which a geographic whole (e.g. the U.S.) is divided into $M$ subunits which have
shares $x_1, x_2, \ldots, x_M$ of aggregate employment. We assume the shares $s_1, s_2, \ldots, s_M$ of a
given industry's employment located in each of these subunits are also available. With such
data, a natural measure of the degree to which employment in the industry departs from
the overall pattern of employment is

$$\sum_{i=1}^{M} (s_i - x_i)^2.$$

We  feel  that  such  a  measure  is  of  economic  interest  in  that  it  emphasizes  departures  which 
involve  significant  fractions  of  an  industry's  employment.  We  will  focus  on  modifications  of 
this  measure  throughout  this  paper  both  because  the  measure  is  of  economic  interest,  and 
because it will prove easier to work with than, say, Gini coefficients.^10

2.1     A  Simple  Model 

We begin with a simple model we will use to ask whether the concentration of employment
within industries is greater than would be expected if all plants were located in an
independent random manner. We view the "random" choice of the model as reflecting what
would be expected in an industry lacking both agglomerative forces (such as spillovers) and
centrifugal forces (such as transportation costs with dispersed demand).

Consider an industry consisting of $N$ business units having shares $z_1, z_2, \ldots, z_N$ of the
industry's employment. We write $H$ for the industry Herfindahl index^11 defined by
$H = \sum_j z_j^2$. Suppose that each business unit chooses a single location for all of its operations
within a country which is divided into $M$ geographic areas having shares $x_1, x_2, \ldots, x_M$
of total employment.^12 As a model of random location, we imagine that each business
unit chooses a single location for all of its employees by throwing a dart at the map of
the country. Formally, we suppose that the geographic areas in which the firms choose to
locate are independent identically distributed random variables $v_1, v_2, \ldots, v_N$, each taking
on the values $1, 2, \ldots, M$ with probabilities $p_1, p_2, \ldots, p_M$. We can think of the probabilities
$p_1, p_2, \ldots, p_M$ as describing the relative sizes of the states on the map. In trying to test
whether this model can describe the geographic concentration of U.S. industries, we will

^10 Florence (1948) provides a lengthy argument for a similar measure.

^11 Note that our definition differs from the conventional use of the term both in that we will usually
think  of  plants  rather  than  firms  as  the  business  units  in  question  and  in  that  market  shares  are  shares  of 
employment  rather  than  shipments. 

^12 We think of the industry as being small relative to the country so that the $\{x_i\}$ can be treated as fixed
regardless  of  the  location  decisions  of  the  business  units  in  the  industry. 


usually take $p_i = x_i$ for all $i$, so that the random location process would on average produce
a pattern of employment shares for the industry matching that we have assumed to prevail
in the aggregate.^13

Let us now examine the degree of localization such a model would produce. The fraction
of the industry's employment located in geographic unit $i$ is

$$s_i = \sum_{j=1}^{N} z_j u_{ji},$$

where $u_{ji}$ is the Bernoulli random variable equal to one if and only if $v_j = i$. Define a
normalized measure, $G$, which we will refer to as the raw geographic concentration of the
industry, by

$$G = \frac{\sum_i (s_i - x_i)^2}{1 - \sum_i x_i^2}.$$

Proposition  1  characterizes  the  raw  geographic  concentration  produced  by  this  model. 
The  fact  that  we  get  such  a  simple  answer  with  the  expected  value  of  G  depending  only 
on  H  and  not  on  any  details  of  the  plant  size  distribution  is  not  only  interesting,  but  also 
useful  in  that  detailed  data  on  plant  sizes  may  be  hard  to  come  by. 
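To make the definitions of $H$ and $G$ concrete, the following is a minimal sketch of how they might be computed from share data. The function names `herfindahl` and `raw_concentration`, and the four-area, three-plant data, are hypothetical and chosen only for illustration.

```python
import numpy as np

def herfindahl(z):
    """Plant-size Herfindahl index H = sum_j z_j^2 for employment shares z."""
    z = np.asarray(z, dtype=float)
    return float(np.sum(z ** 2))

def raw_concentration(s, x):
    """Raw geographic concentration G = sum_i (s_i - x_i)^2 / (1 - sum_i x_i^2)."""
    s = np.asarray(s, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum((s - x) ** 2) / (1.0 - np.sum(x ** 2)))

# Hypothetical data: four areas and an industry with three plants.
x = np.array([0.4, 0.3, 0.2, 0.1])   # area shares of aggregate employment
z = np.array([0.5, 0.3, 0.2])        # plant shares of industry employment
location = np.array([0, 0, 2])       # index of the area chosen by each plant

# s_i = sum_j z_j * u_ji, where u_ji = 1 if plant j is located in area i
s = np.bincount(location, weights=z, minlength=len(x))

H = herfindahl(z)
G = raw_concentration(s, x)
print(f"s = {s}, H = {H:.3f}, G = {G:.3f}")
```

In this made-up example two of the three plants share an area, so the industry shares are (0.8, 0, 0.2, 0) and $G$ is computed directly from the squared gaps between industry shares and aggregate shares.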

Proposition 1 In the model above,

$$E(G) = \frac{1 - \sum_i p_i^2}{1 - \sum_i x_i^2}\, H + \frac{\sum_i (p_i - x_i)^2}{1 - \sum_i x_i^2}.$$

For $(p_1, \ldots, p_M) = (x_1, \ldots, x_M)$ this reduces to

$$E(G) = H.$$

Proof 

The  result  follows  from  a  straightforward  calculation  using  the  fact  that  the  expectation 
of  a  sum  of  random  variables  is  the  sum  of  the  expectations  regardless  of  whether  the  random 
variables  are  independent. 

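$$
\begin{aligned}
\Big(1 - \sum_i x_i^2\Big) E(G) &= E\Big(\sum_i (s_i - x_i)^2\Big) \\
&= \sum_i E\big((s_i - p_i + p_i - x_i)^2\big) \\
&= \sum_i \mathrm{Var}(s_i) + \sum_i (p_i - x_i)^2.
\end{aligned}
$$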


^13 We emphasize that by doing so we are taking as given the concentration of aggregate employment,
even though this may be thought of as resulting from (nonindustry-specific) interfirm spillovers. We are
interested in exploring intra-industry localizations, not in the fact that there is virtually no manufacturing
in the state of Wyoming.


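Using $s_i = \sum_j z_j u_{ji}$ and that the $u_{ji}$ for $j = 1, 2, \ldots, N$ are independent Bernoulli random
variables we have

$$
\begin{aligned}
\Big(1 - \sum_i x_i^2\Big) E(G) &= \sum_i \sum_j z_j^2\, \mathrm{Var}(u_{ji}) + \sum_i (p_i - x_i)^2 \\
&= \sum_i \sum_j z_j^2\, p_i (1 - p_i) + \sum_i (p_i - x_i)^2 \\
&= H \Big(1 - \sum_i p_i^2\Big) + \sum_i (p_i - x_i)^2,
\end{aligned}
$$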

as  desired. 
QED. 
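Proposition 1 can also be checked numerically. The sketch below is a Monte Carlo illustration under assumed inputs: the helper names `simulate_mean_G` and `predicted_mean_G`, the share vectors, and the number of draws are all hypothetical. It throws independent darts for each plant and compares the average simulated $G$ with the closed-form expectation.

```python
import numpy as np

def simulate_mean_G(z, x, p=None, n_draws=20000, seed=0):
    """Monte Carlo estimate of E(G) when plants with shares z throw
    independent darts at areas that are hit with probabilities p."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    p = x if p is None else np.asarray(p, dtype=float)
    M, N = len(x), len(z)
    v = rng.choice(M, size=(n_draws, N), p=p)   # v[d, j]: area of plant j in draw d
    u = np.eye(M)[v]                            # u[d, j, i] = 1 if v[d, j] == i
    s = (u * z[None, :, None]).sum(axis=1)      # s[d, i]: industry share of area i
    G = ((s - x) ** 2).sum(axis=1) / (1.0 - (x ** 2).sum())
    return float(G.mean())

def predicted_mean_G(z, x, p):
    """Closed-form E(G) from Proposition 1."""
    z, x, p = (np.asarray(a, dtype=float) for a in (z, x, p))
    H = (z ** 2).sum()
    return float(((1.0 - (p ** 2).sum()) * H + ((p - x) ** 2).sum())
                 / (1.0 - (x ** 2).sum()))

# Hypothetical shares: four areas, four plants.
x = np.array([0.4, 0.3, 0.2, 0.1])
z = np.array([0.4, 0.3, 0.2, 0.1])
uniform = np.full(4, 0.25)

est_matched = simulate_mean_G(z, x)             # p = x, so E(G) should equal H
est_uniform = simulate_mean_G(z, x, p=uniform)  # p differs from x
print(est_matched, predicted_mean_G(z, x, x))
print(est_uniform, predicted_mean_G(z, x, uniform))
```

With $p = x$ the simulated mean should sit close to $H = \sum_j z_j^2$, whatever the individual plant sizes; with $p \neq x$ the second term of the proposition adds the squared gap between the dart probabilities and the aggregate shares.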

To get some intuition for this result, it may help to note why it holds in a couple
of limiting cases (assuming that $p_i = x_i$ for all $i$). First, for any fixed set of geographic
areas, the limit as $H \to 0$ describes an industry with an infinite number of small firms. In
this case the law of large numbers dictates that a fraction $x_i$ of the industry's employment
will be in geographic unit $i$ and $G$ will be zero. Next, for any fixed firm size distribution
imagine that the sizes of the geographic areas become arbitrarily small, i.e. let $M \to \infty$ with
$\max_i x_i \to 0$. With only a finite number of firms we can in the limit ignore the probability
of two darts hitting any geographic unit. The value of $(s_i - x_i)^2$ will then be approximately
$z_j^2$ if business unit $j$ is located in area $i$ and 0 otherwise. Hence, we can see that the sum
of squared deviations will approach the Herfindahl index.

The result also gives us our first intuitive interpretation of a measure of concentration.
If an industry has raw concentration $G$, we can think of the distribution of employment in
the industry as being as concentrated as would be expected if $1/G$ randomly selected locations
each had a fraction $G$ of the industry's employment.
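The arithmetic behind this interpretation can be written out as follows. This is a sketch under the assumption of $n$ equal-sized plants thrown at random with $p_i = x_i$; the value $G = 0.2$ in the closing comment is hypothetical.

```latex
% n equal-sized plants, each with employment share z_j = 1/n
H = \sum_{j=1}^{n} z_j^2 = n \cdot \frac{1}{n^2} = \frac{1}{n},
\qquad
E(G) = H = \frac{1}{n}
\quad\Longrightarrow\quad
n = \frac{1}{G}.
% Example: G = 0.2 corresponds to n = 5 plants, each with 20\% of employment.
```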

In  testing  whether  a  set  of  industries  exhibits  excess  geographic  concentration,  it  is 
useful  also  to  know  the  variance  of  G  in  this  model.  The  expression  is  not  as  simple  as  that 
for  the  expectation,  and  depends  also  on  the  fourth  moment  of  the  distribution  of  business 
unit  sizes. 

Proposition 2 For $(p_1, p_2, \ldots, p_M) = (x_1, x_2, \ldots, x_M)$ in the model above

The  result  follows  from  a  straightforward  but  tedious  calculation,  which  we  omit. 


2.2     Two  Models  of  Localization 

We now discuss two additional models of the location decision process, each of which is
capable  of  explaining  localization  in  excess  of  that  predicted  in  the  simple  model  above. 
The  models  will  thus  be  useful  in  developing  an  index  of  the  extent  to  which  an  industry 
exhibits  excess  geographic  concentration. 

The models concern location choices which are influenced not only by aggregate
employment, but also by the "natural advantages" to locating in certain areas and by localized
intraindustry "spillovers." In order to discuss how these factors should be incorporated,
it is helpful first to recast the dartboard model of the previous section in more economic
terms. Specifically suppose that in an industry like that described above, each business
unit locates in whichever state maximizes its profits, and that the profits received by the
$k$th unit when it locates in area $i$ take the form

$$\log \pi_{ki} = \log \bar{\pi}_i + \epsilon_{ki},$$

where $\bar{\pi}_i$ is a measure of the average profitability of area $i$ and $\epsilon_{ki}$ is a random variable
reflecting idiosyncratic elements of the suitability of the area to the firm in question (because
of fixed firm characteristics, preferences of its management, the success of its search for a
site, etc.). If we assume that the $\{\epsilon_{ki}\}$ are independent and have the Weibull distribution,
then it is a standard result^14 that firm $k$'s location $v_k$ is a random variable with

$$\mathrm{Prob}\{v_k = i\} = \frac{\bar{\pi}_i}{\sum_j \bar{\pi}_j}.$$

Our standard dartboard model can be obtained as a special case by assuming that the states
have no distinguishing features which affect their average profitability other than differences
in aggregate employment, and that the positive spillover of aggregate employment on profits
takes the form $\bar{\pi}_i = x_i$. With this specification

$$\mathrm{Prob}\{v_k = i\} = \frac{x_i}{\sum_j x_j} = x_i.$$
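The choice-probability result can be illustrated by simulation. The sketch below assumes that the distribution the text calls Weibull, following McFadden (1973), is the type I extreme value distribution that NumPy exposes as `gumbel`; the function name `simulated_choice_shares` and the profitability values are hypothetical.

```python
import numpy as np

def simulated_choice_shares(pi_bar, n_firms=200000, seed=0):
    """Share of firms choosing each area when firm k picks the area i that
    maximizes log(pi_bar_i) + eps_ki, with eps_ki i.i.d. type I extreme value."""
    rng = np.random.default_rng(seed)
    pi_bar = np.asarray(pi_bar, dtype=float)
    eps = rng.gumbel(size=(n_firms, len(pi_bar)))      # idiosyncratic shocks
    choice = np.argmax(np.log(pi_bar) + eps, axis=1)   # profit-maximizing area
    return np.bincount(choice, minlength=len(pi_bar)) / n_firms

# Hypothetical average profitabilities for four areas.
pi_bar = np.array([4.0, 3.0, 2.0, 1.0])
shares = simulated_choice_shares(pi_bar)
predicted = pi_bar / pi_bar.sum()   # pi_bar_i / sum_j pi_bar_j
print(shares)
print(predicted)
```

The simulated shares should be close to $\bar{\pi}_i / \sum_j \bar{\pi}_j$, which is why setting $\bar{\pi}_i = x_i$ recovers the dartboard probabilities.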

Because this dependence of average profits on aggregate employment leads to location
choices which on average recreate aggregate agglomeration given the error structure we
have assumed, we shall take it as a starting point for our subsequent models.^15


^14 See McFadden (1973).

^15 Rather than thinking of this dependence as reflecting aggregate spillovers, it is also possible to obtain
such a relation indirectly by assuming that the profitability at each potential site is independent and ex
ante identical, but that larger states have more sites to choose from (proportionally to their aggregate
employment) so that the best location in a larger state is on average superior.


2.2.1      A  Model  of  Natural  Advantage 

Our first model of industry localization is motivated by the observation that the business
units in an industry will appear to be clustered whenever their location decisions are
influenced by factors which can be regarded as giving a "natural" advantage to certain of the
geographic areas. Our prototypical example is the wine industry. Clearly, the localization
of the industry is in large part due to California's climatic natural advantage in growing
grapes. Similarly, the concentration of industries which import or export bulky commodities
in coastal states reflects a natural advantage in access to transportation.^16 Perhaps
because such factors so straightforwardly lead firms to cluster together they have generally
received less attention than spillovers in discussions of industry localization. They are,
however, an essential component of a complete description of agglomeration.

The simplest way to add natural advantage to the location choice model described above
is to assume that firm $k$'s profits when it locates in state $i$ are again of the form

$$\log \pi_{ki} = \log \bar{\pi}_i + \epsilon_{ki},$$

but with the average profitability of state $i$, $\bar{\pi}_i$, now taken to be a nonnegative random
variable reflecting all of the ways in which nature has chosen to make state $i$ unique (which
affect profits in the same way for all plants). Conditional on a realization of the $\{\bar{\pi}_i\}$, the
probability that each business unit locates in state $i$ is

$$p_i = \frac{\bar{\pi}_i}{\sum_j \bar{\pi}_j}.$$

The larger are the differences between the $p$'s and the $x$'s, the more we can think of locational
patterns as being influenced by natural advantage.

We analyze a specification of this model in which the importance of natural advantage
is neatly parameterized by a single constant $\gamma_0 \in [0, 1]$, by assuming that the state profit
levels $\{\bar{\pi}_i\}$ are independent of the $\{\epsilon_{ki}\}$, and that their distribution is such that $E(p_i) = x_i$
and $\mathrm{Var}(p_i) = \gamma_0 x_i (1 - x_i)$.^17 Note that when $\gamma_0 = 0$, there are no common shocks and
we obtain our standard dartboard model of random locations. At the other extreme, when
$\gamma_0 = 1$ each $p_i$ has the largest possible variance given its mean and support, so that with
probability one the differences in state characteristics are so extreme that all business units
will cluster in a single state.

To explore the level of raw geographic concentration such a model produces for
intermediate $\gamma_0$ and how this depends on the structure of the industry, it is helpful to restate


^16 One formal study of such an effect is Carlton (1983), which finds that energy prices are an important
determinant of plant location decisions in several industries.

^17 For example, one could assume that $\bar{\pi}_i = x_i + \eta_i$ with the $\{\eta_i\}$ being mean zero random variables with
$\sum_i \eta_i = 0$ (with probability one) and $\mathrm{Var}(\eta_i) = \gamma_0 x_i (1 - x_i)$. Another example is given later in this section.
the model using a dartboard metaphor. We can think of the business units' choices of
location as a two stage process. In the first stage nature chooses (from some set of possible
dartboards) a single dartboard on which the geographic areas have sizes $p_1, p_2, \ldots, p_M$,
reflecting the importance and allocation of comparative advantage across the areas (the
larger areas being those with greater average profits). In the second stage, all business
units, being influenced by the same levels of comparative advantage, independently throw
darts at this board to choose their locations.

The following proposition shows that the expected raw concentration is linearly
increasing in $\gamma_0$ and again depends on the distribution of the plant sizes only through $H$.

Proposition 3 In the two stage model of comparative advantage described above

$$E(G) = \gamma_0 + (1 - \gamma_0) H.$$

Proof 

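Using the result of Proposition 1 we have

$$\Big(1 - \sum_i x_i^2\Big) E(G) = E\Big[ H \Big(1 - \sum_i p_i^2\Big) + \sum_i (p_i - x_i)^2 \Big].$$

We have assumed that $E(p_i) = x_i$ and $\mathrm{Var}(p_i) = \gamma_0 x_i (1 - x_i)$. Hence,

$$
\begin{aligned}
\Big(1 - \sum_i x_i^2\Big) E(G) &= H \Big(1 - \sum_i \big(x_i^2 + \gamma_0 x_i (1 - x_i)\big)\Big) + \sum_i \gamma_0 x_i (1 - x_i) \\
&= H \Big(1 - \gamma_0 + (\gamma_0 - 1) \sum_i x_i^2\Big) + \gamma_0 \Big(1 - \sum_i x_i^2\Big) \\
&= \Big(1 - \sum_i x_i^2\Big) \big(\gamma_0 + (1 - \gamma_0) H\big),
\end{aligned}
$$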

as  desired. 
QED. 

For concreteness, it may help to note that one specification satisfying the conditions
above is obtained by assuming that the $\{\bar{\pi}_i\}$ are independent random variables with $\bar{\pi}_i$
having a chi-square distribution with $2 \frac{1 - \gamma_0}{\gamma_0} x_i$ degrees of freedom. The induced distribution
of $p_i = \bar{\pi}_i / \sum_j \bar{\pi}_j$ is then $\beta\big(\frac{1 - \gamma_0}{\gamma_0} x_i, \frac{1 - \gamma_0}{\gamma_0} (1 - x_i)\big)$, and hence has mean $x_i$ and
variance $\gamma_0 x_i (1 - x_i)$.^18
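The two stage dartboard can be simulated directly using this specification. The following is a minimal sketch, assuming $0 < \gamma_0 < 1$ and drawing the dartboard $(p_1, \ldots, p_M)$ from a Dirichlet distribution with parameters $\frac{1 - \gamma_0}{\gamma_0} x_i$; the function name `simulate_natural_advantage` and the share vectors are hypothetical.

```python
import numpy as np

def simulate_natural_advantage(z, x, gamma0, n_draws=20000, seed=0):
    """Monte Carlo estimate of E(G) in the two stage natural advantage model.
    Requires 0 < gamma0 < 1 so that the Dirichlet parameters are finite and positive."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    M, N = len(x), len(z)
    # Stage 1: nature draws one dartboard p per industry realization.
    alpha = (1.0 - gamma0) / gamma0 * x
    p = rng.dirichlet(alpha, size=n_draws)              # shape (n_draws, M)
    # Stage 2: each plant throws an independent dart at that board
    # (inverse-CDF sampling, one uniform draw per plant).
    cdf = np.cumsum(p, axis=1)
    unif = rng.random((n_draws, N))
    v = (unif[:, :, None] > cdf[:, None, :]).sum(axis=2)
    v = np.minimum(v, M - 1)                            # guard against rounding in cdf
    s = (np.eye(M)[v] * z[None, :, None]).sum(axis=1)   # industry shares by area
    G = ((s - x) ** 2).sum(axis=1) / (1.0 - (x ** 2).sum())
    return float(G.mean())

# Hypothetical shares: four areas, four plants, gamma0 = 0.2.
x = np.array([0.4, 0.3, 0.2, 0.1])
z = np.array([0.4, 0.3, 0.2, 0.1])
gamma0 = 0.2
H = float((z ** 2).sum())
est = simulate_natural_advantage(z, x, gamma0)
print(est, gamma0 + (1.0 - gamma0) * H)
```

The simulated mean should be close to $\gamma_0 + (1 - \gamma_0) H$, the value given in Proposition 3.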


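^18 More generally the same distribution for $p_i$ is obtained whenever $\bar{\pi}_i \sim \Gamma\big(\frac{1 - \gamma_0}{\gamma_0} x_i, \lambda\big)$. The joint
distribution of $(p_1, \ldots, p_M)$ is Dirichlet with parameters $\big(\frac{1 - \gamma_0}{\gamma_0} x_1, \ldots, \frac{1 - \gamma_0}{\gamma_0} x_M\big)$. See Johnson and Kotz
(1972, p. 231) for a description of this distribution. The density function is $f(y_1, \ldots, y_{M-1}) =$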


2.2.2     A  Model  of  Spillovers 

Our  second  model  of  industry  localization  is  motivated  by  the  idea  that  externalities  or 
spillovers  may  lead  firms  to  desire  to  locate  their  plants  near  other  plants  in  the  industry. 
We  use  the  term  spillovers  quite  broadly  here  to  refer  to  technological  spillovers,  gains 
from  interfirm  trade,  the  effect  of  local  knowledge  on  the  location  of  spinoff  firms,  etc.  - 
essentially  any  forces  which  lead  firms  to  choose  locations  near  other  firms  in  the  industry. 
To model such factors, one might assume that the profit of business unit $k$ if located in
area $i$ is of the form

$$\log \pi_{ki} = \log \bar{\pi}_i(x_i, v_1, \ldots, v_{k-1}, v_{k+1}, \ldots, v_N) + \epsilon_{ki},$$

where as before $v_j$ is the location of plant $j$. This formulation allows average profits within
a state to be affected generally by both the aggregate employment and the location of
the other plants in the industry (but not by state characteristics). To make the analysis
tractable and to aid interpretation, we again examine a simple parametric specification of
this model. In particular, we consider for $\gamma_0 \in [0, 1]$ profit functions of the form

$$\log \pi_{ki} = \log(x_i) + \sum_{\ell \neq k} e_{k\ell} (1 - u_{\ell i}) (-\infty) + \epsilon_{ki},$$

where the $\{e_{k\ell}\}$ are Bernoulli random variables equal to one with probability $\gamma_0$, $u_{\ell i}$ is an
indicator for whether $v_\ell = i$, and the $\{\epsilon_{ki}\}$ are again independent Weibull random variables
independent of the $\{e_{k\ell}\}$.

To motivate this formulation, note that the first term $\log(x_i)$ is the same dependence
of profits on aggregate employment necessary to reproduce (on average) the pattern of
aggregate employment. The second expression involves two main assumptions made largely
for tractability. First, we have assumed that the effect of plant $\ell$'s location on plant $k$'s
profit depends only on whether they are in the same area, not on the distance between
different areas. Second, rather than assuming a continuous distribution for the magnitude
of the spillovers, we take the spillovers to have an extreme two point support - they are
either strong enough so that firms $k$ and $\ell$ will have negative infinity profits if they locate
apart, or they are nonexistent so that $k$'s profits are independent of $\ell$'s location. As the
probability that any pair of firms has such a crucial spillover between them, $\gamma_0$ clearly
indexes the importance of spillovers.^19

In this spillover model, one needs to be more careful in specifying the decision processes
of the business units. We assume that the business units choose locations in some
preordained order, and that each firm in turn maximizes its profits conditioning only on the

^19 Note that in a violation of accepted practice we are using $\gamma_0$ to represent a completely different parameter
here than in the previous model. We have made this decision to emphasize that the predicted mean
concentration of the two models will turn out to be identical.
location decisions of the firms which have moved previously. We shall assume also that the
indicator variables $\{e_{k\ell}\}$ for whether spillovers exist between pairs of firms are symmetric
and transitive in the sense that $e_{k\ell} = 1 \Rightarrow e_{\ell k} = 1$, and $e_{k\ell} = 1$ and $e_{\ell m} = 1 \Rightarrow e_{km} = 1$.^20 In
this case, the process we have specified is a rational expectations equilibrium in which each
firm earns nonnegative profits and the resulting distribution of locations is independent of
the order in which the business units make their choices. Note that for $\gamma_0 = 0$ the model is
again our standard dartboard model, and for $\gamma_0 = 1$ all firms will cluster in a single area.

To  analyze  the  geographic  concentration  such  a  model  produces  it  helps  again  to  think 
of  the  firms'  location  choices  in  terms  of  a  dart  throwing  metaphor.  Each  business  unit  is 
represented  by  a  dart  which  will  be  thrown  to  choose  a  location.  In  the  first  stage,  nature 
randomly  decides  to  weld  some  of  the  darts  together  into  clusters,  with  the  distribution 
of her decisions being such that each pair of darts has probability $\gamma_0$ of being in the same
welded  cluster.  In  the  second  stage,  each  cluster  of  welded  darts  is  thrown  independently 
(with  all  darts  in  a  cluster  hitting  a  single  point). 

In this model, the business units' locations $v_1, \ldots, v_N$ are identically distributed random
variables, each taking on the value $i \in \{1, 2, \ldots, M\}$ with probability $x_i$. Note, however,
that the $\{v_j\}$ are not independent; instead it is straightforward to show that
$\mathrm{Corr}(u_{ki}, u_{\ell i}) = \gamma_0$ for all $i$ and all $\ell \neq k$.[11] Proposition 4 characterizes the raw
geographic concentration produced by this model.

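[10] Note that we have not fully specified the joint distribution of the $\{e_{k\ell}\}$. The proposition
below will apply to all distributions with the properties above. To see that at least one such joint
distribution exists, consider the case of the $\{e_{k\ell}\}$ being perfectly correlated, so that with
probability $\gamma_0$ all the firms are completely interdependent and with probability $1 - \gamma_0$ all of
their profits are independent.

[11] The reader may note that this is the only property of the joint distribution which is necessary
for the proposition, and hence the proposition applies to any formulation of the interdependent
profits which induces this correlation in location choices.

Proposition 4 In the model of spillovers described above

$$E(G) = \gamma_0 + (1 - \gamma_0) H.$$

Proof

Writing $s_i = \sum_j z_j u_{ji}$, where $u_{ji}$ indicates that business unit $j$ locates in area $i$, we have

$$
\begin{aligned}
E(G) &= \frac{1}{1 - \sum_i x_i^2} \sum_i E\left[(s_i - x_i)^2\right]
      = \frac{1}{1 - \sum_i x_i^2} \sum_i \mathrm{Var}\Big(\sum_j z_j u_{ji}\Big) \\
     &= \frac{1}{1 - \sum_i x_i^2} \Big[ \sum_i \sum_j z_j^2 \, \mathrm{Var}(u_{ji})
        + \sum_i \sum_{j,k,\, j \neq k} z_j z_k \, \mathrm{Cov}(u_{ji}, u_{ki}) \Big] \\
     &= \frac{1}{1 - \sum_i x_i^2} \Big[ \sum_i \sum_j z_j^2 \, x_i (1 - x_i)
        + \sum_i \sum_{j,k,\, j \neq k} z_j z_k \, \gamma_0 \, x_i (1 - x_i) \Big] \\
     &= H + \gamma_0 \sum_{j,k,\, j \neq k} z_j z_k .
\end{aligned}
$$

The desired result now follows from the substitution
$\sum_{j \neq k} z_j z_k = (\sum_j z_j)^2 - \sum_j z_j^2 = 1 - H$.
QED.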
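The result of Proposition 4 can be checked numerically with a short Monte Carlo sketch. The sketch assumes the perfectly correlated joint distribution mentioned in the footnote on the spillover indicators (with probability $\gamma_0$ all darts are welded into one cluster, otherwise none are welded), which is only one of the distributions the proposition covers. The function name, the example shares, and the number of trials are illustrative assumptions.

```python
import numpy as np

def simulate_mean_G(x, z, gamma0, trials=20000, seed=0):
    """Monte Carlo estimate of E(G) under the perfectly correlated welding case."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)   # area shares of aggregate employment
    z = np.asarray(z, dtype=float)   # plant shares of industry employment
    M, N = len(x), len(z)
    denom = 1.0 - np.sum(x ** 2)
    total = 0.0
    for _ in range(trials):
        if rng.random() < gamma0:
            # All darts welded together: a single throw fixes every location.
            loc = np.full(N, rng.choice(M, p=x))
        else:
            # No welding: every dart is thrown independently.
            loc = rng.choice(M, size=N, p=x)
        s = np.bincount(loc, weights=z, minlength=M)  # industry shares by area
        total += np.sum((s - x) ** 2) / denom         # raw concentration G
    return total / trials

x = [0.5, 0.3, 0.2]            # made-up area sizes
z = [0.4, 0.3, 0.2, 0.1]       # made-up plant sizes, H = 0.30
H = sum(v * v for v in z)
gamma0 = 0.25
print(simulate_mean_G(x, z, gamma0), gamma0 + (1 - gamma0) * H)
```

The two printed numbers should be close to each other, up to simulation noise.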

The most notable feature of the result in Proposition 4 is that our model of spillovers and
our model of natural advantage yield identical functional forms for the relationship between
the expected level of geographic concentration and the other industry characteristics (the
plant size distribution and the sizes of the areas for which employment breakdowns are available).
This coincidence motivates the index of concentration proposed below. The coincidence of
the results of Propositions 3 and 4 also tells us, in some sense, that we cannot distinguish
comparative advantage from spillover theories based only on the mean levels of geographic
concentration.[12]

Real world location decisions are likely to be affected both by natural advantage and
by spillovers, so it is probably worth noting that a combination of the two factors also
produces a level of raw concentration which is related to the industry characteristics in the
same way. Specifically, consider a three stage model (we give only the dartboard version)
where in the first stage Nature chooses $(p_1, p_2, \ldots, p_M)$ from a distribution with $E(p_i) = x_i$
and $\mathrm{Var}(p_i) = \gamma_1 x_i (1 - x_i)$; in the second Nature randomly welds each pair of darts with
probability $\gamma_2$; and in the third the welded clusters are independently thrown at a dartboard
in which the states have sizes $(p_1, p_2, \ldots, p_M)$.

Proposition 5 In the three stage model above

$$E(G) = \gamma_0 + (1 - \gamma_0) H,$$

with $\gamma_0 = \gamma_1 + \gamma_2 - \gamma_1 \gamma_2$.

The proof is similar to those of the previous propositions and is therefore omitted.
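The way the two parameters combine is perhaps easiest to read in complementary form. The restatement below is simple algebra on the expression in Proposition 5; the numerical values are purely illustrative and are not estimates from the paper.

$$1 - \gamma_0 = 1 - \gamma_1 - \gamma_2 + \gamma_1 \gamma_2 = (1 - \gamma_1)(1 - \gamma_2).$$

For instance, $\gamma_1 = 0.02$ and $\gamma_2 = 0.03$ would give $\gamma_0 = 0.02 + 0.03 - 0.0006 = 0.0494$, slightly less than the sum of the two. Setting either $\gamma_1 = 0$ or $\gamma_2 = 0$ returns the parameter of the corresponding single-force model.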

3     An  Index  of  Geographic  Concentration 

Suppose we are given data containing the shares $s_1, s_2, \ldots, s_M$ of an industry's employment
in each of $M$ geographic areas, the shares $x_1, x_2, \ldots, x_M$ of total employment in each of
those areas, and the Herfindahl index $H = \sum_{j=1}^{N} z_j^2$ of the industry plant size distribution.
As a convenient index of the degree to which an industry is geographically concentrated we
propose the use of a measure $\gamma$ defined by

$$
\gamma \equiv \frac{G - H}{1 - H}
= \frac{\sum_{i=1}^{M} (s_i - x_i)^2 - \left(1 - \sum_{i=1}^{M} x_i^2\right) \sum_{j=1}^{N} z_j^2}
       {\left(1 - \sum_{i=1}^{M} x_i^2\right) \left(1 - \sum_{j=1}^{N} z_j^2\right)}.
\tag{2}
$$

[12] The theories will differ in their predictions for higher moments of $G$. Recall, however, that we
have not fully specified either model (leaving out the higher moments of the $\{p_i\}$ in the natural
advantage model and the full joint distribution of the $\{e_{k\ell}\}$ in the spillover model). Varying
these elements, each model can produce a range of predictions for $\mathrm{Var}(G)$. For this reason, we do
not feel that attempts to distinguish the theories on such grounds will be fruitful.


We  believe  that  this  index  has  a  number  of  attractive  features.  It  reflects  a  property  which 
is  naturally  meaningful  in  emphasizing  large  deviations  from  the  distribution  of  aggregate 
employment. Because $E(\gamma) = 0$ when data is generated by our standard dartboard model,
it is clearly interpretable as measuring excess concentration beyond that which would be
expected to occur randomly. Finally, and most importantly, the index allows us to easily
perform  meaningful  comparisons  of  the  degrees  of  concentration  in  different  industries, 
e.g.  comparing  a  U.S.  industry  with  its  counterpart  in  another  country,  or  comparing 
concentration  using  3-  and  4-digit  industry  definitions. 
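As a minimal sketch of how the index could be computed from the three inputs named above (industry shares by area, aggregate shares by area, and the plant Herfindahl), consider the following. The function names and the example numbers are assumptions made for illustration; the formulas are the definitions of raw concentration and of the index given in the text.

```python
import numpy as np

def raw_concentration(s, x):
    """Raw concentration G = sum_i (s_i - x_i)^2 / (1 - sum_i x_i^2)."""
    s = np.asarray(s, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum((s - x) ** 2) / (1.0 - np.sum(x ** 2)))

def gamma_index(s, x, H):
    """Index of concentration (G - H) / (1 - H) for plant Herfindahl H."""
    G = raw_concentration(s, x)
    return (G - H) / (1.0 - H)

# Made-up data: three areas and a plant Herfindahl of 0.2.
x = [0.5, 0.3, 0.2]
print(gamma_index(x, x, H=0.2))                # industry mirrors aggregate employment
print(gamma_index([1.0, 0.0, 0.0], x, H=0.2))  # industry located entirely in area 1
```

In the first case raw concentration is zero, so the index is negative (about -0.25): the industry is more evenly spread than a dartboard with so few plants would typically produce. In the second case the index is roughly 0.52.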

To justify such comparisons, we note simply that if the location decision process of
plants is accurately described by either or both of the natural advantage and spillover
models of the previous section then

$$E(\gamma) = \frac{E(G) - H}{1 - H} = \gamma_0,$$

i.e. the index is an unbiased estimator of the fundamental parameter which describes the
strength of natural advantage or spillovers. That the index controls for the number and
size distribution of plants and (subject to the caveat below) for the sizes of the geographic
subunits for which data is available in both of these models gives us some hope that it will
allow us to compare the strength of these forces in real world industries as well.

In  making  the  transition  from  the  models  to  the  real  world  one  caveat  is  necessary  with 
regard  to  comparisons  based  on  differing  geographic  subunits.  Each  of  our  models  takes 
an  extreme  view  of  the  geographic  scope  of  the  forces  which  produce  localization.  For 
example,  when  spillovers  are  important  they  are  assumed  to  accrue  only  if  the  firms  locate 
in  the  identical  geographic  subunit.  In  practice,  spillovers  would  likely  have  an  effect  which 
declines  more  smoothly  and  provides  some  benefit  to  locating  in  nearby  areas  as  well  (more 
so when the areas in question are smaller). If we estimate $\gamma$ using county level data, it will
reflect only the added probability of pairs of plants locating in the same county, while a $\gamma$
estimated from state level data will reflect the typically larger increase in the probability
of the pair locating in the same state.[13]


[13] The location decision process of the natural advantage model depends heavily on the definition
of the subareas, and thus the conditions under which we can compare estimates based on different
geographic subunits are less obvious. One case in which such comparisons are completely justified is
that of the gamma-distributed $\{\pi_i\}$ described at the end of Section 2.2.1. In this model, we can
regard nature's choice of a dartboard as resulting from an assignment of a probability (or more
precisely of an independent gamma-distributed $\pi$) to each square inch of the country, with the
probability of a dart hitting each state (or other subunit) being obtained by summing the
probabilities of its hitting each of the square inch plots within the state. More formally, suppose
that geographic area 1 is divided into subareas $11, 12, \ldots, 1r$ with shares $x_{11}, \ldots, x_{1r}$ of
total employment (with $x_{11} + \cdots + x_{1r} = x_1$). When $(p_{11}, \ldots, p_{1r}, p_2, \ldots, p_M)$ is Dirichlet
with parameters $\left(\frac{1-\gamma_0}{\gamma_0} x_{11}, \ldots, \frac{1-\gamma_0}{\gamma_0} x_{1r}, \frac{1-\gamma_0}{\gamma_0} x_2, \ldots, \frac{1-\gamma_0}{\gamma_0} x_M\right)$,
$(p_{11} + \cdots + p_{1r}, p_2, \ldots, p_M)$ is Dirichlet with parameters
$\left(\frac{1-\gamma_0}{\gamma_0} x_1, \frac{1-\gamma_0}{\gamma_0} x_2, \ldots, \frac{1-\gamma_0}{\gamma_0} x_M\right)$. Note that this specification is
extreme as well in that natural advantages of nearby square inches are likely to be spatially
correlated, so again we would expect to find larger $\gamma$'s when using coarser subdivisions.

The fact that we can regard $\gamma$ as a parameter estimate clearly suggests that it should be
interpreted in light of its standard error. What this standard error is, however, cannot be
determined given the assumptions we have made so far. In particular (and we consider it
a feature of our paper that this is true), the results on mean concentration above have been
derived without ever specifying the higher moments of the $\{p_i\}$ in the natural advantage
model or the joint distribution of the indicators $\{e_{k\ell}\}$ in the spillover model. A straightforward
calculation of the standard errors gives

$$
\mathrm{Var}(\gamma - \gamma_0) = \frac{1}{(1 - H)^2 \left(1 - \sum_i x_i^2\right)^2}
\sum_{i,r} \sum_{j,k,\ell,m} z_j z_k z_\ell z_m \, \mathrm{Cov}(u_{ji} u_{ki},\, u_{\ell r} u_{mr}).
$$

The covariance terms will depend on the unspecified elements of the models. To give a
feel for the magnitude of one of the sources of measurement error in our calculations of
$\gamma$'s, we will present later standard errors (obtained from simulations) from one complete
specification: a natural advantage model where the distribution of the $(p_1, p_2, \ldots, p_M)$ is
assumed to be Dirichlet with parameters
$\left(\frac{1-\gamma_0}{\gamma_0} x_1, \frac{1-\gamma_0}{\gamma_0} x_2, \ldots, \frac{1-\gamma_0}{\gamma_0} x_M\right)$.[14]

[14] One question of interpretation arises in defining the standard errors. Do we treat $\gamma$ as an
estimate of the variance of the ex ante distribution from which the $\{p_i\}$ are drawn or as an
estimate of the ex post value of $\sum_i (p_i - x_i)^2 / (1 - \sum_i x_i^2)$? We take this latter
interpretation, drawing $\{p_i\}$ for which this expression is equal to $\gamma$ from the conditional
distribution induced by the Dirichlet.

4     Data

By design, the data requirements for this paper are fairly simple. We require the distribution
of employment across a set of geographic areas for a set of industries and the Herfindahl
indices of firm and plant employment shares for those industries. Our definition of industries
is the finest one possible given data availability constraints: the 459 manufacturing
industries defined by the 4-digit classifications of the Census Bureau's 1987 S.I.C. system.
Given this decision, we settled on the finest geographic areas for which we felt we could
obtain reliable employment breakdowns: the 50 states plus the District of Columbia. Our
source for all of this data is the 1987 Census of Manufactures, although despite our simple
data requirements the process of constructing the data is quite involved. While the data
we have filled in is by necessity speculative, we hope that our data may be helpful to others
in the future.

Our construction of state-industry employments relies on the Census of Manufactures'
listings of state-industry employments and on the reported totals for manufacturing
employment in each state and in each 2-, 3-, and 4-digit industry. A very substantial data
filling procedure was necessitated by two limitations of the raw data. First, employment in
any state-industry with fewer than 150 employees is omitted from the raw data (presumably
to save space). Because state-industries with fewer than 150 employees contain a nontrivial
fraction of employment in some industries, it is desirable to fill in these numbers. Second, and
more importantly, to protect the confidentiality of individual responses the Census often
reports state-industry employment only as falling into one of five categories corresponding
to employments of 100-249, 250-499, 500-999, 1000-2500, and the unfortunately low
topcode of 2500+.[15] To give some idea of the magnitude of these restrictions, simply setting
employment in each of these cells to its lower bound unequivocally identifies the location
of 90% of employment in the median industry and 80% on average.

To complete our data set, we have filled in data for all 2-, 3-, and 4-digit state
employments using the census data and the adding up constraints across states within
industries and across subindustries within states. Our algorithm is based on the idea of
imposing the upper and lower bounds of the reported ranges, and adjusting employments
within the ranges to try to satisfy the adding up constraints in both directions. Data for
state-industries with fewer than 150 employees is filled by a similar procedure which uses also
the number of "missing" establishments created by the nonreporting of these cells. More
details are given in Appendix A.[16]
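The idea of imposing the reported bounds and then adjusting within the ranges to meet an adding up constraint can be illustrated with a deliberately simplified sketch. It handles a single constraint only (for example, the cells of one industry across states), whereas the procedure described above works with constraints in both directions; it is not the algorithm of Appendix A. The function name, the proportional allocation rule, and the need for a caller-supplied upper bound on topcoded cells are all assumptions of this sketch.

```python
def fill_within_bounds(lower, upper, total):
    """Spread `total` over cells while keeping each cell inside [lower, upper].

    Every cell starts at its lower bound; the employment still unaccounted
    for is then shared out in proportion to the width of each cell's range.
    """
    base = sum(lower)
    slack = [u - l for l, u in zip(lower, upper)]
    room = sum(slack)
    remaining = total - base
    if remaining < 0 or remaining > room:
        raise ValueError("total is inconsistent with the reported ranges")
    if room == 0:
        return list(lower)
    return [l + remaining * w / room for l, w in zip(lower, slack)]

# Hypothetical industry with three censored cells and a known total of 1200.
cells = fill_within_bounds([100, 250, 500], [249, 499, 999], 1200)
print(cells, sum(cells))
```

Because the amount shared out never exceeds the total width of the ranges, each filled value stays inside its reported range and the cells add up to the known total.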

Throughout most of the paper we will think of plants as the business units of the model,
and thus rely on a Herfindahl index $H_p$ of the employment shares of plants to control for the
size distribution of business units. While the Census does not publish this information, it
does make available for each 4-digit industry the total employment and the total number of
plants in each of ten employment size ranges.[17] Subject to disclosure rules, the total number
of employees within the plants of a size category is also usually reported. The withholding is
somewhat more of a problem here than in the construction of the state employments because
it is primarily the shares of the largest plants which are obscured by the nondisclosure rules.
For each industry we used the data to estimate a sum of squared plant shares using a two
step procedure: employees were first allocated between the classes where data was withheld,
and a Herfindahl index was then estimated by a procedure similar to that recommended
by Schmalensee (1977), but taking into account the additional information available here
in the form of the category divisions. To get a rough idea of the magnitude of the resulting
measurement errors we conducted tests of our algorithm on simulated data. The details of
the data construction and of the simulations are reported in Appendix B. The simulations
suggest that measurement errors are not likely to substantially bias our results. A complete
list of our estimated plant Herfindahls is given in Appendix C.

[15] Fortunately, employment in many topcoded cells is relatively small, because the largest state
employments tend to occur in states with several firms so that withholding restrictions do not apply.

[16] An alternate approach used by several past authors, e.g. Enright (1990), is to reduce the number
of topcodes by obtaining data from the County Business Patterns (which does not have an identical
sample, but which has a much higher topcode) and use means of ranges, dropping industries where
topcodes cannot be avoided or where the ranges are too large. Drawbacks of such an approach are
that the important information in the adding up constraints (and often the existence of small
state-industries) is being ignored, and that industries with interesting agglomerations often end up
being dropped.

A firm level Herfindahl index, $H_f$, was taken directly from the Census of Manufactures'
Concentration Ratios in Manufacturing. The data are based on shares of shipments of
the top 50 firms rather than on all firms' employment, and are available for 444 of the
459 industries, with the values for the remaining fifteen (highly concentrated) industries
withheld because of disclosure rules. We drop these industries whenever our analysis uses
$H_f$. In the restricted sample of 444 industries, the mean of $H_f$ is 0.068 and that of $H_p$ is
0.025. It is interesting to note as an aside that, following Scherer (1975), one may use the
ratio of these two concentration measures for an industry as an estimate of the effective
number of plants operated by the large firms.[18] In doing so, we find a mean across industries
of 3.8 plants/firm and a median of 2.7, figures roughly comparable to those reported by
Scherer twenty years ago. The ratio takes on its largest value, 30.1, in S.I.C. 2813, industrial
gases.
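The logic of the ratio can be seen in a small sketch for the idealized case in which every firm splits its activity evenly over the same number of plants. The function names and the example industry are hypothetical, and the sketch uses a single share measure for both indices, whereas the firm-level data described above are based on shipments and the plant-level data on employment.

```python
def herfindahl(shares):
    """Sum of squared shares."""
    return sum(s * s for s in shares)

def effective_plants_per_firm(firm_shares, plant_shares):
    """Ratio of the firm Herfindahl to the plant Herfindahl."""
    return herfindahl(firm_shares) / herfindahl(plant_shares)

# Hypothetical industry: 4 equal-sized firms, each with 3 equal-sized plants.
firm_shares = [0.25] * 4
plant_shares = [1 / 12] * 12
print(effective_plants_per_firm(firm_shares, plant_shares))
```

Here the firm index is 1/4 and the plant index is 1/12, so the ratio recovers the three plants per firm (up to floating point rounding). When firms differ in how they divide activity among plants, the ratio is only an approximation.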

Finally,  for  our  analysis  of  the  geographic  scope  of  concentration  we  obtained  a  dataset 
of  1987  county  level  employments  for  3-digit  industries.  The  dataset  had  been  constructed 
by  filling  in  County  Business  Patterns  data  using  an  algorithm  which  consists  largely  of 
using mean plant sizes for all nondisclosed employments.[19] Some comparisons of this data
with  our  primary  dataset  are  given  at  the  end  of  Appendix  A. 


[17] The ranges are those determined by the lower bounds of 1, 5, 10, 20, 50, 100, 250, 500, 1000,
and 2500 employees.

[18] The estimate is literally valid only if each firm's activities are divided evenly between the
same number of plants.

[19] See Gardocki and Baj (1985).

5     Basic Results on Geographic Concentration

In  this  section  we  describe  the  patterns  of  geographic  concentration  in  U.S.  manufacturing 
industries.  We  begin  at  the  broadest  level  with  a  discussion  of  whether  any  geographic 
concentration  exists  before  moving  on  to  discuss  a  few  aspects  in  a  little  more  detail. 

5.1     Are  Industries  Geographically  Concentrated? 

The single most crucial question one must ask before further studying the geographic
concentration of industries is whether geographic concentration really exists. While a number
of  previous  writers  have  noted  that  localization  appears  to  be  widespread,  we  present  here 
for  the  first  time  formal  tests  of  the  more  stringent  hypothesis  that  the  extent  of  localization 
is  greater  than  that  which  would  be  expected  to  arise  randomly. 

We begin with what is clearly the most compelling application of our simple dartboard
model, assuming that the plants in an industry are the business units choosing their
locations in an independent random manner. The prediction of the model is that $E(G) = H_p$,
with the difference between $G$ and $H_p$ being heteroskedastic with a variance given in
Proposition 2. For the full sample of 459 industries we find that the means of $G$ and $H_p$ are
0.077 and 0.028, respectively. The simple dartboard model predicts that the sample average of
the $G$'s should have a mean of 0.028 and a standard deviation of 0.0005, so this difference
is highly significant, indicating that there is excess localization relative to random location
choice.

Looking at the industry-by-industry estimates, the prevalence of excess localization
which previous authors have noted is strikingly confirmed. The level of raw concentration
$G$ exceeds that which would be expected to arise randomly in 446 of the 459 industries.[20]
In fact, the flip side of this result (that in only 13 industries are plants more evenly
distributed than would be expected at random) is interesting in that it indicates that the
need to be near final consumers is rarely an overwhelming force in location decisions.

Before discussing patterns of geographic concentration in more detail, we would like to
comment briefly on an alternate application of the dartboard model which might potentially
account for higher levels of concentration. Specifically, one could apply the model by
assuming instead that the firms in an industry are the business units, with each choosing
a common location for all of its plants. While this extreme is plainly counterfactual, it
may provide a more reasonable test for the hypothesis that locations are random in some
instances. For example, for a number of years Maytag had exactly two plants in which
it manufactured washing machines, and both of these were located in Newton, Iowa. The
two location decisions were not independent, and given that the entire industry (SIC 3633)
consists of only 18 plants, treating them as independent observations might lead to a
misleading conclusion that locations are unlike those expected from independent random dart
throws.

[20] The difference between $G$ and $H_p$ is larger than twice its standard deviation in 369 of the
446 industries in which the difference is positive, and in none of the 13 industries in which the
difference is negative.

Looking first at the overall level of concentration, we note that for the 444 industries
for which $H_f$ was available the means of $G$ and $H_f$ are 0.074 and 0.068, respectively. While
the overall level of concentration appears to be approximately that predicted by the theory,
there is a difference between correctly predicting the overall level of concentration and
predicting the pattern of concentration across industries. A more demanding test of the
firm-random location theory is obtained by estimating the parameters in the regression
$G_i = \alpha_0 + \alpha_1 H_{fi} + \epsilon_i$ and testing whether $\alpha_0 = 0$ and $\alpha_1 = 1$. OLS estimates of this
equation are given in Table 1. Each of the equalities is rejected individually with a
t-statistic of at least 10, and an F-test of the joint hypothesis yields an $F_{2,442}$ statistic of
177.8, also rejecting strongly. The model thus fails to account for the pattern of concentration
across industries, which should not be surprising given that we know that multiplant firms
do often choose multiple locations. The comparison does provide some intuition for the
degree to which manufacturing industries are localized: they are approximately as localized
as would be expected if units as large as firms' operations in each industry chose locations
at random.

Table 1: Test of the Firm-Random Location Theory

Equation: $G_i = \alpha_0 + \alpha_1 H_{fi} + \epsilon_i$

Parameter     Coeff. Estimate     Std. Error
Constant      0.047*              0.005
$H_f$         0.394*              0.057

$R^2 = 0.09$

* indicates significance at the 1% level.
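For readers who wish to run this kind of test on their own data, the following is a minimal sketch of the regression and the joint F-test using ordinary least squares formulas. The function name is an assumption, and the data in the usage lines are synthetic numbers generated only to exercise the code; they are not the data behind Table 1.

```python
import numpy as np

def ols_joint_test(G, Hf):
    """OLS of G on a constant and Hf, with tests of alpha0 = 0 and alpha1 = 1."""
    G = np.asarray(G, dtype=float)
    Hf = np.asarray(Hf, dtype=float)
    n = len(G)
    X = np.column_stack([np.ones(n), Hf])
    beta, *_ = np.linalg.lstsq(X, G, rcond=None)
    resid = G - X @ beta
    ssr_unrestricted = resid @ resid
    sigma2 = ssr_unrestricted / (n - 2)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    # t-statistics against the hypothesized values 0 and 1.
    t = (beta - np.array([0.0, 1.0])) / se
    # Under the joint null the fitted value is simply Hf, so the restricted
    # residuals are G - Hf; the F statistic has (2, n - 2) degrees of freedom.
    ssr_restricted = np.sum((G - Hf) ** 2)
    F = ((ssr_restricted - ssr_unrestricted) / 2.0) / sigma2
    return beta, se, t, F

# Synthetic illustration only.
rng = np.random.default_rng(0)
Hf = rng.uniform(0.01, 0.3, size=444)
G = 0.05 + 0.4 * Hf + rng.normal(0.0, 0.02, size=444)
beta, se, t, F = ols_joint_test(G, Hf)
print(beta, se, t, F)
```

With 444 observations the F statistic has 2 and 442 degrees of freedom, matching the form of the statistic reported in the text.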


5.2     Levels  of  Geographic  Concentration 

From the previous section we know that the degree of localization in U.S. manufacturing
industries is not zero. In this section we try to use our models to get a feel for how much
concentration there is. It seems likely that the agglomerative forces reflected in our models
will vary greatly from industry to industry. We therefore begin by imposing no structure
across industries and simply computing the index $\gamma$ defined by (2) for each of the 459
4-digit industries in our sample.[21] Recall that $\gamma$ can be interpreted either as the probability
with which any pair of plants choose their locations jointly or as a measurement of the
importance of natural advantage in location choice. A complete list of the $\gamma$'s we find is
contained in Appendix C.

A histogram illustrating the frequency distribution of these $\gamma$'s is presented in Figure
1. In the figure, each bar represents the number of industries for which $\gamma$ lies in an interval
of width 0.01. The tallest bar is that corresponding to values of $\gamma$ between 0 and 0.01.
The distribution appears to be quite skewed, with a mean of 0.051 and a median of 0.026.
Approximately 43% of the industries have $\gamma < 0.02$, while 26% have $\gamma > 0.05$.

How large are these values? Recall that if an industry has many equal sized plants (so
that $H \approx 0$), the natural advantage and spillover models both predict that $E(G) = \gamma$. A
similar level of concentration would result from completely independent random location
decisions by $1/\gamma$ equal sized plants. Hence, for the 118 industries with $\gamma < 0.01$ we can think
of agglomerative forces as being sufficiently weak that, if not for the fewness of plants,
production would be no more concentrated than if it were scattered at 100 equal sized sites.
While there is no justification for any definition of the phrase "not very localized," we feel
that it would be an appropriate description of such a pattern, and we apply it both to
these and to the other 88 industries with $\gamma < 0.02$. At the other extreme, we shall refer to
industries with $\gamma > 0.05$ as "very localized". This category contains 119 industries.
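The cutoffs and the "equal sized sites" reading of the index can be written down in a few lines. The function names are hypothetical; the thresholds of 0.02 and 0.05 and the labels are the ones used in the text.

```python
def classify_localization(gamma):
    """Label an industry using the cutoffs of 0.02 and 0.05 from the text."""
    if gamma < 0.02:
        return "not very localized"
    if gamma > 0.05:
        return "very localized"
    return "intermediate"

def equivalent_equal_sites(gamma):
    """Number of equal sized, independently located sites giving E(G) = gamma."""
    if gamma <= 0:
        raise ValueError("the equivalence is only defined for positive values")
    return 1.0 / gamma

for g in (0.01, 0.03, 0.06):
    print(g, classify_localization(g), equivalent_equal_sites(g))
```

A value of 0.01 corresponds to 100 equal sized sites, which is the benchmark used above for the least localized group.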

The reader should keep in mind that if one views locations as being generated by
a random process, an individual industry's $\gamma$ is a parameter estimate with a nontrivial
standard deviation. To get a feel for the size of this uncertainty in our measurements, we
computed standard errors by simulating a special case of our natural advantage model,
that of Dirichlet-distributed state sizes.[22] Among industries with $H_p < 0.02$ the mean of
the estimated standard errors is 0.02. The means for industries with $H_p$ in the ranges 0.02
- 0.05, 0.05 - 0.10, and 0.10 - 1.0 are 0.024, 0.041, and 0.072, respectively.
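A simulation of this general kind can be sketched as follows. This is a simplified, ex ante version: state sizes are drawn afresh from the Dirichlet distribution in every trial, rather than from the conditional distribution that the paper's footnote on standard errors describes, so it should be read as an illustration of the approach and not as the authors' computation. The function name, the equal-sized areas and plants in the usage lines, and the number of trials are assumptions.

```python
import numpy as np

def simulate_gamma_se(x, z, gamma0, trials=4000, seed=0):
    """Mean and standard deviation of the index under a Dirichlet dartboard."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)   # area shares of aggregate employment
    z = np.asarray(z, dtype=float)   # plant shares of industry employment
    H = np.sum(z ** 2)
    denom = 1.0 - np.sum(x ** 2)
    # Dirichlet parameters chosen so that Var(p_i) = gamma0 * x_i * (1 - x_i).
    alpha = (1.0 - gamma0) / gamma0 * x
    draws = np.empty(trials)
    for t in range(trials):
        p = rng.dirichlet(alpha)                          # nature picks state sizes
        loc = rng.choice(len(x), size=len(z), p=p)        # independent dart throws
        s = np.bincount(loc, weights=z, minlength=len(x))
        G = np.sum((s - x) ** 2) / denom
        draws[t] = (G - H) / (1.0 - H)
    return draws.mean(), draws.std(ddof=1)

# Made-up industry: 10 equal areas, 50 equal plants (H = 0.02).
mean_gamma, se_gamma = simulate_gamma_se([0.1] * 10, [0.02] * 50, gamma0=0.05)
print(mean_gamma, se_gamma)
```

The simulated mean should be close to the chosen parameter, in line with the unbiasedness property of the index, while the standard deviation gives a sense of the sampling noise for an industry of this size.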

To provide some intuition for the importance of accounting properly for random
agglomeration when constructing an index of geographic concentration, Table 2 lists the frequency
with which $\gamma/G$ falls into a number of intervals, both for all industries and for the
subsample of those in the upper quartile of raw geographic concentration. We can think of
the fraction as a rough measure of the portion of raw concentration which is legitimately
attributable to some form of spillovers/natural advantage rather than to randomness. The
table indicates that the two components are comparable in magnitude and that there is
great variation in the mix between them. In roughly one third of the industries (both
overall and among the industries with high raw concentration) the fact that plants are discrete
units and that some clusters appear at random accounts for at least as large a part of
measured raw concentration as do actual agglomerations of plants. Our index may then
give a somewhat different picture of geographic concentration than would a discussion of
raw concentrations.

[21] Note that we assume here and throughout the rest of the paper that the plants are the business
units choosing locations.

[22] The computation requires also a more complete specification of the plant size distribution. For
this purpose, we took the plant sizes to be those used as an intermediate step to the Herfindahl
calculation.

Table 2: Raw Concentration Attributable to Spillovers/Comparative Advantage

                 Fraction of Industries with $\gamma/G$ in Range
Range            All Industries     High G Industries
Less than 0      0.03               0.03
0.00 - 0.25      0.09               0.10
0.25 - 0.50      0.22               0.16
0.50 - 0.75      0.32               0.19
0.75 - 1.00      0.33               0.53

5.3     Patterns  of  Concentration 

An attempt to explore formally the industry characteristics which tend to be associated
with localization is well beyond the scope of this paper.[23] We would, however, like to
present a few more tables concerning the geographic concentration in different industries.

In Table 3 we summarize the levels of geographic concentration of the 4-digit
subindustries of each 2-digit manufacturing industry. For each 2-digit industry, the table lists the
fraction of subindustries which fall in the not very localized ($\gamma < 0.02$), intermediate, and
very localized ($\gamma > 0.05$) ranges. High levels of geographic concentration are most
prevalent in the tobacco, textile, and leather industries and most rare in the paper, rubber and
plastics, and fabricated metal products industries.

In  hopes  that  the  forces  affecting  geographic  concentration  might  be  clearest  at  the 
extremes,  Table  4  lists  the  15  most  and  the  15  least  localized  industries  in  terms  of  the 
index  7.  As  Krugman  (1991a)  has  previously  noted,  there  is  no  obvious  single  factor 
accounting  for  extreme  concentration.  The  most  concentrated  industry,  furs,  is  probably 
explained  both  by  the  local  transfer  of  knowledge  from  one  generation  to  the  next,  and 
as  a  response  to  buyers'  search  costs.  Furs  also  have  an  unusually  high  ratio  of  value  to 
weight.    The  next  most  concentrated,  wine,  may  be  largely  attributable  to  the  natural 


* For interesting work on this topic see Henderson (1988) and Enright (1990).
Table 3: Concentration by 2-digit Category

                                          # of 4-digit    Percent of 4-digit industries with
2-digit industry                          subindustries   $\gamma < 0.02$   $\gamma \in [0.02, 0.05]$   $\gamma > 0.05$

20. Food and kindred products                  49               47                 18                      35
21. Tobacco products                            4                0                  0                     100
22. Textile mill products                      23                9                 13                      78
23. Apparel and other textile products         31               13                 42                      45
24. Lumber and wood products                   17               29                 47                      24
25. Furniture and fixtures                     13               69                  8                      23
26. Paper and allied products                  17               53                 47                       0
27. Printing and publishing                    14               71                 14                      14
28. Chemicals and allied products              31               38                 24                      38
29. Petroleum and coal products                 5               60                  0                      40
30. Rubber and misc. plastics                  15               73                 27                       0
31. Leather and leather products               11                0                 36                      64
32. Stone, clay, and glass products            26               58                 27                      15
33. Primary metal industries                   26               39                 35                      27
34. Fabricated metal products                  38               61                 32                       8
35. Industrial machinery and equipment         51               49                 26                      26
36. Electronic and other electric equip.       37               41                 46                      14
37. Transportation equipment                   18               28                 33                      39
38. Instruments and related products           17               47                 41                      11
39. Miscellaneous manufacturing ind.           18               44                 22                      33
advantage of California in growing grapes. The concentration of oilfield machinery (in the
Houston/Galveston area) may be partially attributable to the location of oil production.

The list of the 15 least concentrated industries is also something of a mixed bag. The
industries certainly do not stand out as being those in which spreading out to be close to
final consumers is important, and the list contains several industries, e.g. vacuum cleaners
and small arms ammunition, where raw concentration is substantial, but employment turns
out to be concentrated in a few very large (randomly scattered) plants.*

6     Scope  of  Geographic  Concentration 

In  this  section  we  examine  two  different  aspects  of  the  scope  of  geographic  concentration. 
First,  we  discuss  the  scope  in  the  sense  of  industrial  definition,  i.e.  whether  concentration 
is  principally  a  phenomenon  which  exists  at  the  level  of  individual  industries  or  whether  it 
is  characteristic  of  broad  industry  classes  as  well.  Next,  we  discuss  the  geographic  scope  of 
concentration,  comparing  data  at  the  county,  state,  and  regional  levels. 

6.1     Industry  Definition 

Table 5 provides a simple look at the concentration of 2-, 3-, and 4-digit industries. While
raw geographic concentration increases steadily as we move to finer industry definitions,
the increase in $\gamma$ appears to come more abruptly as we move from the 2-digit to the
3-digit level. This naturally raises two questions of scope. Is there any correlation in the
location decisions of firms which share only a 2-digit industry class, or is the concentration
of 2-digit industries entirely a consequence of the localization of their 3-digit
subindustries? Are location decisions influenced as strongly by the locations of plants
belonging to different 4-digit industries within the same 3-digit class as they are by the
locations of plants belonging to their own 4-digit industry? In this section, we develop a
framework for addressing such questions and apply it to our data.

Consider an industry with $r$ subindustries having shares $w_1, w_2, \ldots, w_r$ of the
overall industry employment. Write $H^j$ for the plant Herfindahl of the $j$th subindustry,
and $H = \sum_{j=1}^{r} w_j^2 H^j$ for the plant Herfindahl of the broader industry. Suppose
that plants choose their locations in a manner which is nearly identical to that of our
spillover model, but that the probability of any pair of darts being welded together (i.e.
that a crucial spillover exists between the firms they represent) is $\gamma_j$ if both darts
correspond to plants within the


* In interpreting these latter cases the reader should keep in mind that the errors in
measuring $\gamma$ include both the inherent uncertainty of analyzing random dart throws and
errors in filling in Census nondisclosures. Each of these components is larger when $H_p$ is
larger, so the list may contain many industries with large $H_p$ simply because this is where
we have made the largest errors in measurement.
Table 4: Most and Least Localized Industries

15 Most Localized Industries

4-digit industry                              $H_p$     $G$      $\gamma$
2371. Fur Goods                               0.007     0.63     0.63
2084. Wines, Brandy, Brandy Spirits           0.041     0.50     0.48
2252. Hosiery, n.e.c.                         0.008     0.44     0.44
3533. Oil and Gas Field Machinery             0.015     0.44     0.43
2251. Women's Hosiery                         0.028     0.42     0.40
2273. Carpets and Rugs                        0.013     0.39     0.38
2429. Special Product Sawmills, n.e.c.        0.009     0.38     0.37
3961. Costume Jewelry                         0.017     0.33     0.32
2895. Carbon Black                            0.054     0.34     0.30
3915. Jewelers' Materials, Lapidary           0.025     0.32     0.30
2874. Phosphatic Fertilizers                  0.066     0.34     0.29
2061. Raw Cane Sugar                          0.038     0.32     0.29
2281. Yarn Mills, Except Wool                 0.005     0.29     0.28
2034. Dehydrated Fruits, Veg's, Soups         0.030     0.30     0.28
3761. Guided Missiles, Space Vehicles         0.046     0.28     0.25

15 Least Localized Industries

4-digit industry                              $H_p$     $G$      $\gamma$
3021. Rubber and Plastics Footwear            0.06      0.05     -0.013
2032. Canned Specialties                      0.03      0.02     -0.012
2082. Malt Beverages                          0.04      0.03     -0.010
3635. Household Vacuum Cleaners               0.18      0.18     -0.009
3652. Prerecorded Records and Tapes           0.04      0.03     -0.008
3482. Small Arms Ammunition                   0.18      0.18     -0.004
3324. Steel Investment Foundries              0.04      0.04     -0.003
3534. Elevators and Moving Stairways          0.03      0.03     -0.001
2052. Cookies and Crackers                    0.03      0.03     -0.0009
2098. Macaroni and Spaghetti                  0.03      0.03     -0.0008
3262. Vitreous China Table, Kitchenware       0.13      0.13     -0.0006
2035. Pickles, Sauces, Salad Dressings        0.01      0.01     -0.0003
3821. Laboratory Apparatus and Furniture      0.02      0.02     -0.0002
2062. Cane Sugar Refining                     0.11      0.11     0.0002
3433. Heating Equipment except Electric       0.008     0.009    0.0002
Table 5: Concentration and Industry Definition

                            Industry means
Industry Definition     $H_p$      $G$       $\gamma$
2-digit                 0.007     0.032     0.026
3-digit                 0.014     0.058     0.045
4-digit                 0.028     0.077     0.051

$j$th subindustry and $\gamma_0$ otherwise. We will assume that
$\gamma_0 < \min_{j=1,\ldots,r} \gamma_j$, so that spillovers are always more powerful (in
expectation) between plants which are more similar.*

Again writing $G$ for the raw geographic concentration of the broader industry we have

Proposition 6 In the model above,

$$E(G) = H + \gamma_0 \Bigl(1 - \sum_{j=1}^{r} w_j^2\Bigr) + \sum_{j=1}^{r} \gamma_j w_j^2 (1 - H^j).$$

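Proof

Write $n_j$ for the number of business units in the $j$th subindustry and
$z_{j1}, \ldots, z_{jn_j}$ for the sizes of plants in that subindustry. Writing $u_{j\ell i}$
for the Bernoulli random variable indicating whether the $\ell$th plant in subindustry $j$
locates in area $i$, the assumption on welding implies

$$\mathrm{Corr}(u_{j\ell i}, u_{j'\ell' i}) =
\begin{cases}
\gamma_j & \text{if } j = j' \text{ and } \ell \neq \ell' \\
\gamma_0 & \text{if } j \neq j'.
\end{cases}$$

We then have

$$\begin{aligned}
\Bigl(1 - \sum_i x_i^2\Bigr) E(G) &= \sum_i \mathrm{Var}(s_i) \\
&= \sum_i \Bigl[ \sum_{j,\ell} z_{j\ell}^2 \, \mathrm{Var}(u_{j\ell i})
   + \sum_{(j,\ell) \neq (j',\ell')} z_{j\ell} z_{j'\ell'} \, \mathrm{Cov}(u_{j\ell i}, u_{j'\ell' i}) \Bigr] \\
&= \Bigl(\sum_i x_i (1 - x_i)\Bigr)
   \Bigl[ H + \sum_j \gamma_j \Bigl(w_j^2 - \sum_\ell z_{j\ell}^2\Bigr)
   + \gamma_0 \Bigl(1 - \sum_j w_j^2\Bigr) \Bigr].
\end{aligned}$$

Since $\sum_i x_i (1 - x_i) = 1 - \sum_i x_i^2$, this gives

$$E(G) = H + \sum_j \gamma_j \Bigl(w_j^2 - \sum_\ell z_{j\ell}^2\Bigr) + \gamma_0 \Bigl(1 - \sum_j w_j^2\Bigr).$$

QED.

* We require also that the welding relation is again symmetric and transitive. The assumption
that $\gamma_0 < \min_j \gamma_j$ ensures that such a joint distribution on the welding
probabilities exists so that the model is well defined.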

Given data on state-industry employments for an industry and for each of its subindustries,
raw geographic concentrations may be computed both for the larger industry and for the
subindustries. Write $G_j$ for the raw concentration in the $j$th subindustry and
$\hat{\gamma}_j$ for $(G_j - H^j)/(1 - H^j)$. An unbiased estimate of the degree of
intersubindustry spillovers may then be obtained from the raw concentrations by setting

$$\hat{\gamma}_0 = \frac{G - H - \sum_{j=1}^{r} \hat{\gamma}_j w_j^2 (1 - H^j)}{1 - \sum_{j=1}^{r} w_j^2}.$$
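As a check on the algebra, the estimator can be sketched numerically. The function names `expected_G` and `gamma0_hat` and the three-subindustry parameter values below are hypothetical and chosen only for illustration; the sketch feeds the expectation from Proposition 6 back into the estimator and should recover the assumed $\gamma_0$ up to floating-point error.

```python
import numpy as np

def expected_G(w, H_sub, gamma_sub, gamma0):
    """Right-hand side of Proposition 6 for given (hypothetical) parameters."""
    w = np.asarray(w, dtype=float)
    H_sub = np.asarray(H_sub, dtype=float)
    gamma_sub = np.asarray(gamma_sub, dtype=float)
    H = np.sum(w**2 * H_sub)  # plant Herfindahl of the broader industry
    return float(H + gamma0 * (1 - np.sum(w**2))
                 + np.sum(gamma_sub * w**2 * (1 - H_sub)))

def gamma0_hat(G, w, H_sub, G_sub):
    """Estimate of gamma_0 from the raw concentration G of the broader
    industry and the raw concentrations G_sub of its subindustries."""
    w = np.asarray(w, dtype=float)
    H_sub = np.asarray(H_sub, dtype=float)
    G_sub = np.asarray(G_sub, dtype=float)
    gamma_sub_hat = (G_sub - H_sub) / (1 - H_sub)
    H = np.sum(w**2 * H_sub)
    numerator = G - H - np.sum(gamma_sub_hat * w**2 * (1 - H_sub))
    return float(numerator / (1 - np.sum(w**2)))

# Hypothetical three-subindustry example (not taken from the data).
w = [0.5, 0.3, 0.2]
H_sub = np.array([0.02, 0.05, 0.10])
gamma_sub = np.array([0.04, 0.06, 0.10])
G_sub = H_sub + gamma_sub * (1 - H_sub)   # expected subindustry concentrations
G = expected_G(w, H_sub, gamma_sub, gamma0=0.01)
estimate = gamma0_hat(G, w, H_sub, G_sub)
print(estimate)
```

In actual data $G$ and the $G_j$ are random, so the estimate equals $\gamma_0$ only in expectation.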

To discuss the degree to which spillovers are general, we define a new measure $\lambda$ by

[defining equation for $\lambda$ not recoverable from the source text]

Note that this measure should be zero if there are no spillovers between plants in different
subindustries and one if spillovers are equally strong regardless of the subindustry to which
plants in the same broad industry belong.*

There are 97 3-digit industries with more than one 4-digit subindustry. A histogram
for $\lambda$ on these 3-digit industries is given in Figure 2. The values of $\lambda$ are
fairly evenly spread between 0 and 0.8, indicating that there certainly is some clustering of
similar 4-digit industries. In answer to our introductory question, however, it appears that
spillovers are nearly as strong across 4-digit industries in the same 3-digit industry as
within the 4-digit industries themselves in only about 20% of the cases.

Moving on to yet broader industry classes, Table 6 reports the results of the identical
calculation using the 3-digit subindustries of each 2-digit industry. The mean value of
$\lambda$ across 2-digit industries is 0.29. There is great variation across industries. In
four cases (furniture, industrial machinery, electronic and electric equipment, and
transportation equipment) the data indicate that there is no concentration at all at the
2-digit level. On the other hand, there is substantial concentration of the 3-digit industries
within the 2-digit tobacco, textile, and lumber industries.

We should note that while we have tended to use the word "spillovers" in this section, several
factors may explain the results. Technology or knowledge spillovers are one possibility, but


* Of course, $\lambda$ is a random variable so these statements apply literally only to the
analogous expression in the underlying parameters. Note also that $\lambda$ is not an
unbiased estimate of this expression. An earlier version of this paper defined $\lambda$ by a
linear interpolation which in practice yields values almost identical to those we report here.
Figure 2: Extent of Spillovers between 4-digit Industries

[Histogram of $\lambda$ across 3-digit industries; horizontal axis: $\lambda$]
spatially correlated natural advantages or other "spillovers" like minimizing transportation
costs given intersubindustry trade could also account for positive values of $\lambda$. While
a more detailed analysis is clearly called for, the list of 2-digit industries where
$\lambda$ is largest suggests that we may be detecting in large part that the "natural
advantages" which are important to the subindustries are similar.

Table 6: Extent of Spillovers between 3-digit Industries

2-digit industry                        $\gamma_0$    $\lambda$
Food and kindred products                 0.002        0.14
Tobacco products                          0.151        0.88
Textile mill products                     0.115        0.61
Apparel and other textiles                0.010        0.29
Lumber and wood products                  0.016        0.63
Furniture and fixtures                    0.001        0.02
Paper and allied products                 0.005        0.31
Printing and publishing                   0.005        0.48
Chemicals and allied products             0.007        0.25
Petroleum and coal products               0.007        0.12
Rubber and misc. plastics                 0.003        0.38
Leather and leather products              0.017        0.31
Stone, clay, and glass products           0.002        0.20
Primary metal industries                  0.012        0.41
Fabricated metal products                 0.003        0.22
Industrial machinery and equip.           0.000        0.00
Electronic & other electric equip.        0.000        0.02
Transportation equipment                  0.001       -0.08
Instruments and related products          0.013        0.36
Miscellaneous manufacturing               0.011        0.34

6.2     Geographic  Scope  of  Concentration 

In Section 3, we noted that the $\gamma$'s estimated from county-, state-, or region-level
data should be identical (in expectation) provided the scope of spillovers is such that
advantages are gained only if firms choose identical locations. If on the other hand the
effect of spillovers (or the spatial correlation of natural advantage) is smoothly declining
with distance, then those $\gamma$'s will reflect the excess probability with which pairs of
firms tend to locate in the same county, state, and region, respectively. To investigate the
geographic scope of spillovers we estimated $\gamma$'s from our county/3-digit dataset using
counties, states, and the nine census regions as the units of observation.
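The comparison across geographic levels can be sketched as below. This is a hedged illustration rather than the procedure actually used: the function names, the four-county example, and the county-to-state labels are hypothetical, and the sketch assumes $G = \sum_i (s_i - x_i)^2 / (1 - \sum_i x_i^2)$ and $\gamma = (G - H)/(1 - H)$, the forms implied by the proof of Proposition 6 and the definition of $\hat{\gamma}_j$ in Section 6.1.

```python
import numpy as np

def gamma_index(industry_emp, total_emp, H):
    """gamma = (G - H) / (1 - H), with G normalised by 1 - sum_i x_i^2.

    industry_emp: the industry's employment by area
    total_emp:    aggregate employment by area (same ordering)
    H:            the industry's plant Herfindahl
    """
    industry_emp = np.asarray(industry_emp, dtype=float)
    total_emp = np.asarray(total_emp, dtype=float)
    s = industry_emp / industry_emp.sum()   # industry employment shares
    x = total_emp / total_emp.sum()         # aggregate employment shares
    G = np.sum((s - x) ** 2) / (1.0 - np.sum(x ** 2))
    return float((G - H) / (1.0 - H))

def aggregate(values, groups):
    """Sum area-level values into coarser units (e.g. counties into states)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    return np.array([values[groups == g].sum() for g in np.unique(groups)])

# Hypothetical example: four equal-sized counties in two states, with the
# industry spread over both counties of state "A" and absent from state "B".
county_industry = [50, 50, 0, 0]
county_total = [25, 25, 25, 25]
state_of_county = ["A", "A", "B", "B"]
H = 0.10

gamma_county = gamma_index(county_industry, county_total, H)
gamma_state = gamma_index(aggregate(county_industry, state_of_county),
                          aggregate(county_total, state_of_county), H)
print(gamma_county, gamma_state)
```

In this example the plants share a state but not a county, so the state-level index exceeds the county-level one, which is the pattern associated with spillovers that extend beyond county boundaries.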

Figure 3 presents histograms of the $\gamma$'s estimated from the three levels of data.
Comparing first the county- and state-level estimates, note that substantially more
concentration is apparent at the state level. The median $\gamma$'s at the two levels are
0.005 and 0.023, with the median of the ratio between them being 0.25, so that typically the
effect of spillovers is such that about one fourth of the excess tendency of plants to locate
in the same state involves
Figure 3: Concentration at the County, State, and Regional Level
plants locating in the same county. We draw two conclusions. First, given that states
have many more than 4 counties, spillovers appear to have a stronger effect at very small
distances. Second, spillovers are still quite substantial at a range beyond that of counties.
In only a few cases do spillovers appear both to be substantial and limited in scope to the
county level.* The rubber and plastics footwear industry seems to be the unique example
where concentration is substantially greater at the county level than at the state level, i.e.
where tightly grouped clusters of plants are spread (excessively) evenly across the states as
if to minimize transportation costs.

Measured levels of state and regional concentration are more similar, although the
regional data shows a much thicker tail of very concentrated industries. (The mean
$\gamma$'s are 0.044 and 0.078.) The general pattern of slightly more than half of the
tendency of firms to locate in the same region being accounted for by the tendency to locate
in the same state appears to hold equally well for industries which are very unconcentrated
and very concentrated at the state level, although there is considerable variation about this
norm.**

7     Geographic  Concentration  within  the  Firm 

In this section we investigate the tendency of plants belonging to the same firm to locate
together. The issue is interesting not only in its own right, but also in that such a tendency
might account for a significant portion of the localization we have identified.

To analyze the potential for measuring agglomeration within the firm, we consider an
industry consisting of $r$ firms with shares $w_1, w_2, \ldots, w_r$ of the industry's
employment. Suppose that firm $j$ consists of $n_j$ plants having shares
$z_{j1}, \ldots, z_{jn_j}$ of the industry's employment. Assume that the location choices of
the plants are again made as in our spillover model with the probability of a pair of darts
being welded being equal to $\gamma_0$ if they correspond to plants in different firms and
$\gamma_1 > \gamma_0$ if they belong to the same firm. The model is thus a special case of
the model of Section 6.1 with the firms analogous to subindustries and the expected degree of
spillovers within each firm assumed to be identical. A direct corollary of Proposition 6 is

Proposition 7 In the model above,

$$E(G) = H_p + \gamma_0 (1 - H_f) + \gamma_1 (H_f - H_p).$$


* The most notable cases are fur goods, building paper and board mills, and periodicals.

** Industries notable for unusually high (relative) regional concentration include ordnance
and accessories, nonferrous foundries, and cigarettes. Industries in which state-level
clusters are unusually dispersed include photographic equipment and supplies, radio and
television receiving equipment, and periodicals.
In trying to apply the prediction of this model to recover $\gamma_1$, a great obstacle
arises: state-firm employments are much harder to find than state-industry employments. As a
result, we cannot separately estimate $\gamma_0$ and $\gamma_1$ for a single industry. What
we try to do instead is to identify average values of $\gamma_0$ and $\gamma_1$ using
cross-industry variation. Specifically, we note that if one makes the heroic assumption that
the parameters $\gamma_{0i}$ and $\gamma_{1i}$ for industry $i$ are random variables whose
conditional means are independent of $H_{pi}$ and $H_{fi}$, then the coefficients
$\hat{\alpha}_0$ and $\hat{\alpha}_1$ from the OLS regression

$$G_i - H_{pi} = \alpha_0 (1 - H_{fi}) + \alpha_1 (H_{fi} - H_{pi}) + \epsilon_i$$

are consistent for $E(\gamma_0)$ and $E(\gamma_1)$.
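A regression of this form, with no intercept and two constructed regressors, can be sketched with ordinary least squares as follows. The function name `estimate_alphas` and the four synthetic industries are hypothetical; the regressors are taken to be $1 - H_f$ and $H_f - H_p$, following Proposition 7, and the synthetic data are built without noise so that the coefficients used to generate them are recovered.

```python
import numpy as np

def estimate_alphas(G, H_p, H_f):
    """OLS (no intercept) of G - H_p on (1 - H_f) and (H_f - H_p).

    Returns the array [alpha_0_hat, alpha_1_hat].
    """
    G = np.asarray(G, dtype=float)
    H_p = np.asarray(H_p, dtype=float)
    H_f = np.asarray(H_f, dtype=float)
    y = G - H_p
    X = np.column_stack([1.0 - H_f, H_f - H_p])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, noise-free industries (hypothetical values, for illustration).
H_p = np.array([0.01, 0.03, 0.05, 0.02])   # plant Herfindahls
H_f = np.array([0.02, 0.08, 0.06, 0.10])   # firm Herfindahls
G = H_p + 0.046 * (1 - H_f) + 0.068 * (H_f - H_p)

alphas = estimate_alphas(G, H_p, H_f)
print(alphas)
```

Standard errors are omitted from this sketch; with real data they matter a great deal, since $H_f - H_p$ varies little across industries.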

We estimated the regression above for our sample of 444 4-digit industries. The parameter
estimates $\hat{\alpha}_0$ and $\hat{\alpha}_1$ are 0.046 (s.e. 0.005) and 0.068 (s.e. 0.067),
respectively. While the first coefficient estimate is highly significant, the second is quite
imprecise. Hence, while the point estimate is that plants belonging to the same firm are
slightly more agglomerated than other plants in the same industry, we cannot rule out a
substantially higher level of intrafirm agglomeration. Given that the mean of $H_f - H_p$ is
only 0.04, we can say fairly confidently that only a very small portion of total geographic
concentration is attributable to intrafirm agglomerations.

As a simple specification test for this model, we estimated also the unconstrained
regression

$$G_i = \beta_0 + \beta_1 H_{fi} + \beta_2 H_{pi} + \epsilon_i$$

and performed a Wald test of the restriction $\beta_0 + \beta_1 + \beta_2 = 1$. The test does
reject the specification at the 5% level, although we note that the test would no longer
reject if we made the minor bias correction suggested by Appendix B. While we believe the
results of this section are of interest, we thus admit that they clearly should be interpreted
with some caution.
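A Wald test of a single linear restriction of this kind can be sketched as below. Several things here are assumptions rather than the authors' procedure: the unconstrained regression is taken to be $G_i$ on a constant, $H_{fi}$, and $H_{pi}$ (the form under which the constrained model implies $\beta_0 + \beta_1 + \beta_2 = 1$), the covariance matrix is the homoskedastic OLS one, and the data are simulated with hypothetical parameter values.

```python
import numpy as np

def wald_sum_to_one(G, H_f, H_p):
    """Wald statistic for the restriction beta_0 + beta_1 + beta_2 = 1 in the
    (assumed) regression G = beta_0 + beta_1 * H_f + beta_2 * H_p + error.

    Uses the homoskedastic OLS covariance matrix. Under the restriction the
    statistic is asymptotically chi-squared with one degree of freedom.
    """
    G = np.asarray(G, dtype=float)
    H_f = np.asarray(H_f, dtype=float)
    H_p = np.asarray(H_p, dtype=float)
    X = np.column_stack([np.ones_like(G), H_f, H_p])
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, G, rcond=None)
    resid = G - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance estimate
    V = sigma2 * np.linalg.inv(X.T @ X)       # covariance of beta_hat
    R = np.ones(k)                            # restriction vector: R @ beta = 1
    W = (R @ beta - 1.0) ** 2 / (R @ V @ R)
    return beta, float(W)

# Simulated data satisfying the restriction (hypothetical parameter values).
rng = np.random.default_rng(0)
H_p = rng.uniform(0.005, 0.10, size=200)
H_f = H_p + rng.uniform(0.0, 0.10, size=200)
G = 0.046 + 0.022 * H_f + 0.932 * H_p + rng.normal(0.0, 0.001, size=200)

beta_hat, W = wald_sum_to_one(G, H_f, H_p)
CRITICAL_5PCT = 3.841   # 5% critical value of chi-squared with 1 d.o.f.
print(beta_hat, W, W > CRITICAL_5PCT)
```

The restriction is rejected at the 5% level when the statistic exceeds the critical value.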

8     Conclusion 

In this paper we have developed a new framework for the analysis of geographic
concentration based on a dart-throwing metaphor. Using a series of very simple models, we
obtain characterizations of both "random" agglomeration and of agglomeration caused by
spillovers and natural advantage. Our most important theoretical result is that it is possible
to control for industry characteristics in a fairly robust manner when measuring geographic
concentration. This leads us to propose two new "natural" indices, $\gamma$ and $\lambda$, to
measure the localization of industries and the relative strength of cross-industry
agglomerations.
The empirical work of this paper is largely descriptive. Besides reaffirming that
localization is ubiquitous, we have tried to develop rough sketches of the variation of
localization across industries, the geographic and industry levels at which it is most
prominent, and its relationship to the structure of multiplant firms.

The  existence  of  geographic  concentration  has  attracted  the  attention  of  researchers  in 
many  fields  and  many  potential  explanations  have  been  proposed.  In  the  future,  we  hope 
that  our  measurement  techniques  will  prove  useful  both  in  descriptive  work  on  the  nature  of 
geographic  concentration  and  in  attempts  to  use  the  facts  uncovered  to  assess  the  relative 
importance  of  various  agglomerative  forces. 
Appendix A

This  appendix  describes  the  process  by  which  state-industry  employment  figures  were 
constructed.  The  1987  Census  of  Manufactures  reports  the  employment  or  a  range  of 
employments  for  all  state-industries  with  at  least  150  employees.  Table  7  indicates  the 
number  of  these  state-industries  for  which  data  is  categorized,  the  number  of  these  which 
are  topcoded  at  2500  or  more  employees,  and  the  average  across  industries  of  the  fraction 
of  employees  whose  state  cannot  be  determined  simply  by  assigning  each  state  its  minimum 
possible  employment. 

Table 7: Extent of Withheld Data

                                    Industry Definition
                              2-digit    3-digit    4-digit
# Industries                     21        141        460
# Cells with Ranges             153       1776       5700
# Topcodes                       46        268        487
Avg. Employment Fraction        0.02       0.11       0.20

Before  beginning  to  fill  in  the  data,  we  first  adjust  the  upper  or  lower  bounds  on  any  2- 
or  3-digit  state  industry  for  which  a  sharper  bound  can  be  obtained  by  summing  the  upper 
or  lower  bounds  of  the  subindustries  which  comprise  it.  This  reduces  the  number  of  2-  and 
3-digit  state  industries  without  upper  bounds  to  13  and  157,  respectively.  In  addition,  a 
total  of  82  and  680  bounds  are  tightened  on  cells  where  a  non-topcoded  range  had  been 
given. 
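The bound-tightening step can be sketched for a single state-industry cell as follows. The function name `tighten_bounds`, the representation of a topcoded cell by an upper bound of `None`, and the example ranges are hypothetical; the sketch also assumes that the listed subindustries exhaust the parent industry.

```python
def tighten_bounds(parent, children):
    """Sharpen the (lower, upper) bounds of a parent state-industry cell using
    the bounds of the subindustry cells that comprise it.

    parent:   (lo, hi), with hi = None when the cell is topcoded
    children: list of (lo, hi) pairs for the subindustries, hi = None if topcoded
    """
    lo, hi = parent
    # The parent can be no smaller than the sum of its children's lower bounds.
    lo = max(lo, sum(child_lo for child_lo, _ in children))
    # An upper bound from the children is available only if none is topcoded.
    if all(child_hi is not None for _, child_hi in children):
        child_hi_sum = sum(child_hi for _, child_hi in children)
        hi = child_hi_sum if hi is None else min(hi, child_hi_sum)
    return lo, hi

# A topcoded parent cell acquires an upper bound from its two subindustries.
print(tighten_bounds((2500, None), [(1000, 2499), (500, 999)]))
# A non-topcoded parent cell has its lower bound raised.
print(tighten_bounds((1000, 2499), [(500, 999), (1000, 2499)]))
```

Applied to every 2- and 3-digit cell, a pass of this kind removes some topcodes and tightens some ordinary ranges.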

The filling process begins with the 21x51 matrix of 2-digit data. First, a rough estimate
of the total employment in cells which are reported as zero is made for each state and for
each industry. The estimate is simply 35 times the number of missing firms with 20 or more
employees plus 6 times the number of missing firms with fewer than 20 employees, provided
that this total is less than 150 times the number of empty cells in the appropriate row or
column. (Each of these estimates is less than 600 employees.)

The main part of the algorithm assigns values within the given range to each cell, trying
to do so in a manner that makes the sums of the rows and columns as close as possible to
those indicated by the reported totals for employment in each industry and manufacturing
employment in each state. While this could be treated as a large optimization problem with
a number of variables equal to the number of categorized state-industry employments, this
approach was deemed intractable and instead an admittedly ad hoc procedure was used to
fill in cells sequentially. Essentially, the procedure repeatedly looks at the matrix of data,
identifies the categorized cells for which there is the least uncertainty as to employment,
fills in employment of those cells, and again looks at the matrix in which the filled-in
numbers are accepted as fact.

The process of identifying which cells to fill in follows a set of priorities. First, if there
are any rows or columns for which all categorized cells must be set to the minimum or
maximum to satisfy adding-up constraints, those cells are chosen. Next, the algorithm looks
for rows or columns in which only a single element is unknown. If all rows and columns
have multiple unknown cells, the algorithm selects the row or column in which there is
the least variance possible within the unknown ranges. The manner in which this is done
usually results in topcodes not being filled in until virtually all active rows and columns
contain a topcode, and rows/columns with multiple topcodes not being filled until there are
no rows/columns with a single topcode remaining. When filling cells in a row with multiple
unknown elements, the algorithm looks at the departures from expected employment in the
row and column of each unknown cell and adjusts the cells in a direction calculated loosely
on the analogy of calculating conditional means of normal random variables. The amount
by which a cell is adjusted is limited by the constraint that its row/column must be able
to sum as well.
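The first two priorities can be sketched for a single row or column as follows. This is an illustrative sketch only: the function name `forced_fill` is hypothetical, every unknown cell is assumed to have a finite range (so topcodes are not handled), and the later priorities, which rely on the ad hoc adjustment described above, are not attempted.

```python
def forced_fill(target_total, known_sum, ranges):
    """Fill the unknown cells of one row or column when the adding-up
    constraint leaves no choice.

    target_total: reported total for the row or column
    known_sum:    sum of the cells already known or filled in
    ranges:       list of (lo, hi) bounds for the unknown cells
    Returns the filled values, or None if the cells are not yet determined.
    """
    remainder = target_total - known_sum
    lo_sum = sum(lo for lo, _ in ranges)
    hi_sum = sum(hi for _, hi in ranges)
    if remainder <= lo_sum:
        return [lo for lo, _ in ranges]   # every cell forced to its minimum
    if remainder >= hi_sum:
        return [hi for _, hi in ranges]   # every cell forced to its maximum
    if len(ranges) == 1:
        return [remainder]                # a single unknown takes what is left
    return None                           # needs the lower-priority rules

print(forced_fill(1000, 400, [(250, 499), (350, 499)]))  # forced to minimums
print(forced_fill(1000, 700, [(250, 499)]))              # single unknown cell
print(forced_fill(1000, 300, [(250, 499), (250, 499)]))  # not determined
```

Once a row or column has been filled in this way, its values would be treated as known when the remaining rows and columns are examined.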

After filling the 2-digit data, the process is repeated on the 3- and 4-digit data, the only
difference being that instead of using the constraint that the state-industry employments
should add up to the state total manufacturing employment we use the set of constraints
dictated, for example, by employment within each state in the 3-digit subindustries of a
2-digit industry adding up to the employment in that state in the 2-digit industry.

In addition, the previously estimated state and industry total employments in states
whose employments are reported as zero are allocated across state-industries by an
algorithm identical to that described above. In the 4-digit data, these rounded-to-zero
employments are occasionally a nontrivial fraction of the total employment in an industry.

While there is no way to tell that this algorithm is doing well, it is at least possible to
tell that it is doing badly to the extent that the algorithm is unable to make the state or
industry totals add up (although due to rounding errors totals are off by up to 400 employees
in industries where no data is withheld). Of the 21 2-digit industries, the maximum error
in the adding-up constraints is 508 employees with all other industries within 400. In the
3-digit industries and 4-digit industries there are two and six industries where the error is
greater than 400, with two 4-digit industries having errors greater than 1000 employees,
the maximum being 2010 (although these two are very big industries). The average errors
in the state adding-up constraints are 31, 177, and 558 at the 2-, 3- and 4-digit levels. In
all but one of the 2-digit industries and in all but 6 of the 3-digit industries it was never
necessary to fill in multiple topcodes at the same time.

We would have liked to simulate a data withholding process to provide rough estimates
of the bias and variance of measurement error on the raw geographic concentration measure
$G$ induced by our data filling. However, the Census's withholding process is not
sufficiently transparent that we felt confident that we could reasonably simulate it. Absent
that, we present here a small test of the accuracy of our procedure based on data obtained
separately from the County Business Patterns for the area where our procedure is most
suspect, filling in topcodes in the 4-digit data.

Data was available from County Business Patterns on state-industry employment for
171 of the 487 4-digit state-industries where employment was topcoded at 2500 or more.
The CBP's sample differs somewhat from that of the Census of Manufactures, and as a result
the CBP reported employment is below 2500 in 30 of these state-industries. We dropped
these state-industries from our test. (We chose not to use CBP data as an input to our
algorithm precisely because it is often incompatible with range and adding-up constraints
in the Census of Manufactures data.) Of the remaining 141 state-industries, four have
very large employments and in each case our data fit extremely well, giving our estimates
a misleadingly high 0.98 correlation with the CBP data. Across the remaining 137
state-industries the mean and standard deviation of employment are virtually identical in our
data and in the CBP data and the correlation between the two is 0.74. (The means are
5329 and 5304, the standard deviations 3451 and 3306.) For comparison, if the Census of
Manufactures had reported ranges for this data using the CBP ranges (2500-4999,
5000-10000, and 10000-20000) and we had constructed estimates simply by filling in the mean
of the appropriate range, the correlation coefficient would be higher (0.93), but the sample
mean and standard deviation would be much further from those of the CBP data. (The mean
would be 5939, and the standard deviation 4314.)

While the results above suggest that our procedure has some accuracy in filling, the
most important question is clearly what implications errors in assigning state employments
have for the computation of $G$. Even a procedure which is quite inaccurate might yield
reasonable estimates of $G$ if it simply assigns clusters of employment to the wrong states.
As a rough estimate of the effect that our filling in of topcodes has on the computation
of $G$, we constructed a measure $G_{cbp}$ by substituting the CBP employment totals for
our filled-in employment totals for all topcoded cells in the 61 industries where the CBP
data allowed all topcodes to be filled in (and where there was at least one topcode). For
this purpose we took the CBP data to report an employment of 2500 whenever it actually
reported a smaller number. Comparing our previously estimated $G$ with the value $G_{cbp}$
we find that the means are 0.054 and 0.050, with a correlation of 0.96. The absolute value
of the difference between the two has a median of 0.0015, with the value being larger than
0.005 in 11 of the 61 industries. While this suggests that our filling of topcodes does not
induce significant bias or large measurement errors, we should point out that the industries
in which this test was performed may have been among the easier industries with topcodes
to fill because they tended to have fewer topcodes than the average industry with at least
one topcode (1.5 vs. 2.5). On the other hand, the majority of 4-digit industries have no
topcoded cells to begin with. Also, while the filled topcodes would appear to be the
greatest potential problem with our algorithm, this test says nothing about biases due to
the filling of nontopcoded ranges and of state-industry employments of less than 150.

For another look at the sensitivity of measured levels of concentration to the way in
which we filled in the data, we compared the values of G obtained from state/3-digit
industry calculations using our standard dataset with those obtained using state totals
from our county-level dataset. (Recall that this latter dataset had been constructed
entirely from CBP data using mean establishment sizes to fill in missing values.) Because
the latter dataset is not
based  on  the  1987  SIC  revision,  the  comparisons  below  involve  only  the  96  SIC  codes  whose 
definitions  were  unchanged.  The  values  of  G  from  the  two  data  sources  differ  (in  absolute 
value)  by  less  than  0.005  in  59  of  the  96  industries.  The  difference  is  between  0.01  and 
0.02  in  14  industries,  and  greater  than  0.02  in  eight.  In  several  of  these  cases,  however, 
the  values  of  G  are  quite  large  so  that  we  may  regard  the  two  datasets  as  giving  roughly 
similar  measurements.  The  differences  are  both  larger  than  0.015  and  larger  than  20%  of 
the  larger  G  for  only  eight  SIC  codes:  213,  281,  302,  315,  321,  375,  386,  and  387.  The  data 
for  these  industries  should  perhaps  be  treated  with  some  caution. 
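The screening rule used in the last comparison can be written out as a small check. This is only a sketch in Python; the function name, argument names, and the example values are ours and purely illustrative, while the two thresholds (0.015 and 20% of the larger G) are those stated above.

```python
def needs_caution(g_standard, g_county, abs_tol=0.015, rel_tol=0.20):
    """Return True when two measurements of G disagree materially.

    An industry is flagged only if the absolute difference exceeds abs_tol
    AND exceeds rel_tol times the larger of the two G values.
    """
    diff = abs(g_standard - g_county)
    larger = max(g_standard, g_county)
    return diff > abs_tol and diff > rel_tol * larger


# Hypothetical values: a gap of 0.02 matters when G is small,
# but not when G is already large.
print(needs_caution(0.05, 0.03))   # True
print(needs_caution(0.30, 0.28))   # False
```

Requiring both conditions is what allows industries with large G and a moderate absolute gap to be treated as giving roughly similar measurements.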
Appendix B

This appendix discusses the manner in which the variable $H_p$ was constructed from the
Census data and the potential implications for our measurements of geographic
concentration. Given that a significant amount of information about the distribution of
plant shares within each industry is available, we have chosen to construct $H_p$ by a
procedure which is much more akin to filling in data than to imposing any distributional
assumptions and estimating parameters, and which therefore will admittedly be ad hoc. The
algorithm has two main steps: the first allocates employees across size classes to obtain a
regular data structure, and the second computes an expected sum of squares for the plants
within each class using a rule of thumb recommended by Schmalensee (1977).

In 316 of the 459 industries the Census Bureau has withheld data on the total employment
within a size class (typically one with three or fewer plants). In this case, the Census
data instead contain the combined employment in this class and another indicated class.
To perform a rough separation of the employment in combined classes, we first, for each
size class, used the sample of industries for which the total employment is reported to
estimate the mean and variance of employment per plant as a function of the number of
plants in the class. (The mean was assumed to be of the form $a_0 + a_1 \log(1 + n)$ and
the variance of the form $b_0 + b_1 (1/n)$, with the parameters estimated by OLS
regressions.) Employment in each of the combined classes was then set so that departures
from the predicted means were inversely proportional to the predicted variances, provided
that this did not violate the upper and lower bounds on plant size.

The second step essentially consists of assuming that the sizes of the plants within each
class are discretely uniformly spread on a range centered on the mean and with its
boundary at the closer of the two endpoints of the size range. $H_p$ is estimated simply
by taking the sum of the squares of the plant shares for this particular allocation of
employees across plants. Schmalensee (1977) reports that this assumption of linear shares
within a class seems to give the best estimates of the Herfindahl index in a similar
problem.
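The second step can be illustrated with a short Python sketch. It rests on our reading of "discretely uniformly spread" as evenly spaced plant sizes that include both ends of the range; the function names, the tuple layout, and the example classes are hypothetical and are not taken from the Census files.

```python
import math


def class_plant_sizes(n_plants, total_emp, lo, hi=math.inf):
    """Spread n_plants evenly over a range centred on the class mean.

    The half-width of the range is the distance from the mean to the
    nearer endpoint of the size class [lo, hi] (hi may be open-ended).
    """
    mean = total_emp / n_plants
    if n_plants == 1:
        return [mean]
    half_width = max(0.0, min(mean - lo, hi - mean))
    step = 2 * half_width / (n_plants - 1)
    return [mean - half_width + k * step for k in range(n_plants)]


def plant_herfindahl(classes):
    """classes: iterable of (n_plants, total_emp, lo, hi) tuples."""
    sizes = [s for c in classes for s in class_plant_sizes(*c)]
    total = sum(sizes)
    # Sum of squared employment shares over all plants in the industry.
    return sum((s / total) ** 2 for s in sizes)


# Hypothetical industry: two plants in a 10-20 class (30 employees in all)
# and one plant in a 50-99 class (70 employees).
example = [(2, 30, 10, 20), (1, 70, 50, 99)]
print(class_plant_sizes(2, 30, 10, 20))  # [10.0, 20.0]
print(plant_herfindahl(example))
```

Because the range is centred on the class mean, the allocation always reproduces the class's total employment, so only the within-class dispersion is being assumed.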

We do not regard this procedure as an attempt to assign employments to plants, but
just as a complicated function which approximates the Herfindahl index given the available
data. To assess the accuracy of this procedure, we constructed a simulated dataset of
5000 industries. The simulated industries were created by assuming that the plant sizes
in industry $i$ consist of $n_i$ draws from a lognormal distribution with mean $\mu_i$ and
standard deviation $\sigma_i$. The parameters $n_i$, $\mu_i$, and $\sigma_i$ were themselves
realizations of independent lognormal random variables with means (standard deviations)
527 (1106), 143 (286), and 287 (2101), respectively. These parameters were obtained from
sample statistics (and the estimated $H_p$) of our 459 industry sample. The data produced
by the simulations bear a superficial resemblance to the actual data, although they tend to
contain far more extreme outliers (e.g., industries with over 95% of employment in a single
plant). We created a simulated dataset modified to preserve confidentiality by combining
employment in any size class with two or fewer plants with the employment in the next
lower nonempty size class. This modification involved withholding data in 3200 of the
5000 industries.
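A minimal Python sketch of how one simulated industry might be drawn is given below. The means and standard deviations are those reported above; everything else is an assumption made for illustration, in particular the conversion from a lognormal's mean and standard deviation to the parameters of its logarithm, the rounding of $n_i$ to an integer of at least one, and all function names.

```python
import math
import random


def lognormal_params(mean, sd):
    """Convert a lognormal's mean and sd to the (mu, sigma) of its log."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def draw_lognormal(rng, mean, sd):
    mu, sigma = lognormal_params(mean, sd)
    return rng.lognormvariate(mu, sigma)


def simulate_industry(rng):
    """Draw the plant sizes of one simulated industry."""
    # Rounding n_i to an integer >= 1 is our assumption, not the paper's.
    n_i = max(1, round(draw_lognormal(rng, 527, 1106)))
    mu_i = draw_lognormal(rng, 143, 286)
    sigma_i = draw_lognormal(rng, 287, 2101)
    return [draw_lognormal(rng, mu_i, sigma_i) for _ in range(n_i)]


def herfindahl(sizes):
    """True plant Herfindahl: sum of squared employment shares."""
    total = sum(sizes)
    return sum((s / total) ** 2 for s in sizes)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sizes = simulate_industry(rng)
print(len(sizes), herfindahl(sizes))
```

The true Herfindahl computed this way is the benchmark against which the estimate from the binned, partly withheld data would be compared.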

We applied our algorithm to this dataset to produce estimated plant Herfindahls,
$\hat{H}_p$, and compared these to the true $H_p$. On average the estimated Herfindahls
were slightly smaller than the true values, the ratio of the means being 1.05. Our
principal use for estimates of $H_p$ in the paper is as a part of the computation of
$\gamma$ for each industry. Note that if we set
$\hat{\gamma} = (G - \hat{H}_p)/(1 - \hat{H}_p)$, where
$G = \gamma_0 + (1 - \gamma_0) H_p + \epsilon$ with $E(\epsilon \mid H_p, \hat{H}_p) = 0$,
then

$$E(\hat{\gamma} - \gamma_0 \mid \hat{H}_p) = (1 - \gamma_0)\, E\left( \frac{H_p - \hat{H}_p}{1 - \hat{H}_p} \,\Big|\, \hat{H}_p \right).$$

Hence, if $E(H_p \mid \hat{H}_p) = \hat{H}_p$, then our estimates of $\gamma_0$ will be unbiased.
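The identity above can be checked numerically. The following Python sketch uses hypothetical values for $\gamma_0$, the true Herfindahl, and its estimate, and sets the error term to zero; the function names are ours.

```python
def gamma_hat(g, h_hat):
    """Concentration index computed from raw concentration G and an estimated Herfindahl."""
    return (g - h_hat) / (1.0 - h_hat)


def implied_g(gamma0, h_true):
    """Raw concentration implied by gamma0 and the true Herfindahl (error term set to zero)."""
    return gamma0 + (1.0 - gamma0) * h_true


# Hypothetical industry: the estimate h_hat differs slightly from h_true.
gamma0, h_true, h_hat = 0.05, 0.0315, 0.03

error = gamma_hat(implied_g(gamma0, h_true), h_hat) - gamma0
predicted = (1.0 - gamma0) * (h_true - h_hat) / (1.0 - h_hat)
print(error, predicted)  # the two agree up to floating-point rounding

# With an exact Herfindahl the index recovers gamma0.
print(gamma_hat(implied_g(0.042, 0.008), 0.008))
```

The error in the index depends only on the gap between the true and estimated Herfindahl, scaled by $(1 - \gamma_0)/(1 - \hat{H}_p)$, which is why the accuracy of the Herfindahl estimate is the quantity examined in the simulations.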

One cannot estimate $E(H_p \mid \hat{H}_p)$ without making assumptions about the
distribution of $H_p$. While our simulated $H_p$'s do not match the observed distribution
of plant Herfindahls, we hope that they will at least provide results which are indicative
of the magnitude of the bias our procedure produces. Over our 5000 industry sample, an OLS
regression of $H_p$ on $\hat{H}_p$ yields an estimated constant of 0.0003 (t-stat: 1.3),
with the estimated coefficient on $\hat{H}_p$ being 1.04 (t-stat: 228.9). Restricting the
regression to the observations with $H_p < 0.3$ to eliminate the effect of unreasonable
industries gives estimates of 0.0001 (t-stat: 0.5) and 1.05 (t-stat: 173.8). Adding a
quadratic term to this regression we find the coefficient to be insignificant, suggesting
that nonlinearity is not a problem. Regressing the squared error from the linear regression
on a constant, $\hat{H}_p$, and $\hat{H}_p^2$ to get an idea of the magnitude of the
measurement error in a typical industry gives the estimate
$\sigma^2 = 0.00003 + 0.003 \hat{H}_p + 0.007 \hat{H}_p^2$.

If we believe these results, then for a typical industry with $\gamma_0$ small we will
underestimate $\gamma_0$ by about $0.05 H_p$. Given that the mean of $H_p$ is less than
0.03, this bias is fairly small. To correct this bias, one could simply multiply all of our
previous estimates of $H_p$ by 1.05. The correction is not large, however, and given that
we have limited confidence in the simulations we decided not to impose it.
Appendix C


Industry  Employment  Plant Herfindahl  Gamma
2011 Meat packing plants  113.9  0.008  0.042
2013 Sausages and other prepared meats  78.7  0.004  0.006
2015 Poultry slaughtering and processing  147.9  0.006  0.055
2021 Creamery butter  1.7  0.045  0.147
2022 Cheese, natural and processed  33.0  0.009  0.131
2023 Dry, condensed, and evaporated dairy products  14.1  0.056  0.015
2024 Ice cream and frozen desserts  20.3  0.006  0.001
2026 Fluid milk  72.4  0.002  0.002
2032 Canned specialties  24.5  0.032  -0.012
2033 Canned fruits and vegetables  66.1  0.006  0.044
2034 Dehydrated fruits, vegetables, and soups  10.1  0.030  0.260
2035 Pickles, sauces, and salad dressings  21.4  0.013  0.000
2037 Frozen fruits and vegetables  49.8  0.011  0.079
2038 Frozen specialties, n.e.c.  37.5  0.015  0.002
2041 Flour and other grain mill products  13.3  0.009  0.019
2043 Cereal breakfast foods  16.0  0.054  0.016
2044 Rice milling  4.5  0.063  0.136
2045 Prepared flour mixes and doughs  12.1  0.020  0.015
2046 Wet corn milling  8.8  0.060  0.137
2047 Dog and cat food  13.4  0.018  0.011
2048 Prepared feeds, n.e.c.  34.5  0.002  0.019
2051 Bread, cake, and related products  161.9  0.003  0.001
2052 Cookies and crackers  45.3  0.028  -0.001
2053 Frozen bakery products, except bread  9.9  0.035  0.013
2061 Raw cane sugar  6.2  0.038  0.289
2062 Cane sugar refining  5.5  0.107  0.000
2063 Beet sugar  7.9  0.031  0.074
2064 Candy and other confectionery products  45.8  0.012  0.046
2066 Chocolate and cocoa products  11.0  0.107  0.038
2067 Chewing gum  5.2  0.157  0.072
2068 Salted and roasted nuts and seeds  8.8  0.079  0.025
2074 Cottonseed oil mills  2.6  0.032  0.166
2075 Soybean oil mills  7.0  0.020  0.070
2076 Vegetable oil mills, n.e.c.  0.9  0.084  0.049
2077 Animal and marine fats and oils  9.8  0.009  0.010
2079 Edible fats and oils, n.e.c.  9.3  0.021  0.031
2082 Malt beverages  31.9  0.042  -0.010
2083 Malt  1.4  0.072  0.238
2084 Wines, brandy, and brandy spirits  13.9  0.041  0.480
2085 Distilled and blended liquors  9.0  0.035  0.079
2086 Bottled and canned soft drinks  96.8  0.002  0.006
2087 Flavoring extracts and syrups, n.e.c.  9.1  0.018  0.025
2091 Canned and cured fish and seafoods  6.7  0.020  0.081
2092 Fresh or frozen prepared fish  38.2  0.007  0.069
2095 Roasted coffee  10.7  0.026  0.032
2096 Potato chips and similar snacks  33.1  0.011  0.006
2097 Manufactured ice  4.7  0.006  0.012
2098 Macaroni and spaghetti  8.6  0.028  -0.001
2099 Food preparations, n.e.c.  58.0  0.009  0.013
2111 Cigarettes  32.0  0.223  0.168
2121 Cigars  2.5  0.107  0.158
2131 Chewing and smoking tobacco  3.3  0.083  0.200


Industry  Employment  Plant Herfindahl  Gamma
2141 Tobacco stemming and redrying  6.9  0.045  0.178
2211 Broadwoven fabric mills, cotton  72.3  0.025  0.170
2221 Broadwoven fabric mills, manmade fiber and silk  88.3  0.007  0.228
2231 Broadwoven fabric mills, wool  14.0  0.042  0.087
2241 Narrow fabric mills  18.5  0.011  0.074
2251 Women's hosiery, except socks  29.3  0.028  0.398
2252 Hosiery, n.e.c.  36.5  0.008  0.437
2253 Knit outerwear mills  59.0  0.012  0.065
2254 Knit underwear mills  19.3  0.082  0.020
2257 Weft knit fabric mills  34.9  0.019  0.191
2258 Lace and warp knit fabric mills  20.5  0.014  0.116
2259 Knitting mills, n.e.c.  3.8  0.071  0.094
2261 Finishing plants, cotton  16.5  0.019  0.123
2262 Finishing plants, manmade  27.9  0.022  0.188
2269 Finishing plants, n.e.c.  11.7  0.020  0.098
2273 Carpets and rugs  53.3  0.013  0.378
2281 Yarn spinning mills  89.0  0.005  0.284
2282 Throwing and winding mills  18.3  0.025  0.206
2284 Thread mills  6.5  0.051  0.207
2295 Coated fabrics, not rubberized  10.3  0.020  0.001
2296 Tire cord and fabrics  5.1  0.121  0.178
2297 Nonwoven fabrics  13.8  0.023  0.038
2298 Cordage and twine  6.9  0.017  0.034
2299 Textile goods, n.e.c.  16.4  0.009  0.021
2311 Men's and boys' suits and coats  55.2  0.010  0.043
2321 Men's and boys' shirts  76.7  0.004  0.062
2322 Men's and boys' underwear and nightwear  17.2  0.032  0.097
2323 Men's and boys' neckwear  7.4  0.018  0.106
2325 Men's and boys' trousers and slacks  93.3  0.004  0.064
2326 Men's and boys' work clothing  33.1  0.009  0.090
2329 Men's and boys' clothing, n.e.c.  52.2  0.006  0.025
2331 Women's, misses', and juniors' blouses and shirts  73.4  0.002  0.038
2335 Women's, misses', and juniors' dresses  112.7  0.001  0.098
2337 Women's, misses', and juniors' suits and coats  55.2  0.003  0.034
2339 Women's, misses', and juniors' outerwear, n.e.c.  107.3  0.002  0.028
2341 Women's and children's underwear  53.7  0.006  0.053
2342 Brassieres, girdles, and allied garments  13.8  0.024  0.019
2353 Hats, caps, and millinery  17.2  0.013  0.044
2361 Girls' and children's dresses and blouses  30.9  0.007  0.030
2369 Girls' and children's outerwear, n.e.c.  40.8  0.008  0.046
2371 Fur goods  2.2  0.007  0.630
2381 Fabric dress and work gloves  4.8  0.027  0.102
2384 Robes and dressing gowns  8.7  0.029  0.024
2385 Waterproof outerwear  6.4  0.057  0.075
2386 Leather and sheep-lined clothing  2.1  0.034  0.100
2387 Apparel belts  10.5  0.013  0.167
2389 Apparel and accessories, n.e.c.  8.3  0.015  0.020
2391 Curtains and draperies  27.1  0.008  0.025
2392 Housefurnishings, n.e.c.  50.5  0.006  0.036
2393 Textile bags  8.8  0.011  0.005
2394 Canvas and related products  16.7  0.005  0.010
2395 Pleating and stitching  14.1  0.009  0.026

Industry  Employment  Plant Herfindahl  Gamma
2396 Automotive and apparel trimmings  44.2  0.016  0.074
2397 Schiffli machine embroideries  5.9  0.025  0.153
2399 Fabricated textile products, n.e.c.  30.5  0.008  0.005
2411 Logging  85.8  0.001  0.062
2421 Sawmills and planing mills, general  148.3  0.001  0.039
2426 Hardwood dimension and flooring mills  29.7  0.005  0.062
2429 Special product sawmills, n.e.c.  2.2  0.009  0.374
2431 Millwork  89.0  0.005  0.013
2434 Wood kitchen cabinets  67.0  0.002  0.011
2435 Hardwood veneer and plywood  20.5  0.008  0.050
2436 Softwood veneer and plywood  38.9  0.008  0.187
2439 Structural wood members, n.e.c.  24.6  0.003  0.026
2441 Nailed wood boxes and shook  5.9  0.009  0.018
2448 Wood pallets and skids  25.7  0.001  0.006
2449 Wood containers, n.e.c.  5.4  0.023  0.026
2451 Mobile homes  39.9  0.005  0.037
2452 Prefabricated wood buildings  25.4  0.006  0.025
2491 Wood preserving  11.8  0.005  0.029
2493 Reconstituted wood products  22.0  0.011  0.029
2499 Wood products, n.e.c.  56.3  0.002  0.006
2511 Wood household furniture  135.9  0.003  0.077
2512 Upholstered household furniture  82.1  0.004  0.130
2514 Metal household furniture  30.1  0.010  0.013
2515 Mattresses and bedsprings  24.4  0.004  0.007
2517 Wood television and radio cabinets  5.9  0.072  0.010
2519 Household furniture, n.e.c.  5.9  0.050  0.004
2521 Wood office furniture  31.0  0.009  0.045
2522 Office furniture, except wood  49.7  0.036  0.050
2531 Public building and related furniture  21.6  0.012  0.008
2541 Wood partitions and fixtures  40.6  0.002  0.003
2542 Partitions and fixtures, except wood  33.5  0.007  0.011
2591 Drapery hardware and blinds and shades  20.6  0.018  0.006
2599 Furniture and fixtures, n.e.c.  29.3  0.005  0.007
2611 Pulp mills  14.2  0.051  0.047
2621 Paper mills  129.1  0.008  0.039
2631 Paperboard mills  52.3  0.011  0.024
2652 Setup paperboard boxes  8.7  0.011  0.037
2653 Corrugated and solid fiber boxes  105.7  0.001  0.001
2655 Fiber cans, drums, and similar products  12.5  0.009  0.006
2656 Sanitary food containers  15.8  0.047  0.028
2657 Folding paperboard boxes  50.7  0.004  0.002
2671 Paper coated and laminated packaging  15.0  0.018  0.018
2672 Paper coated and laminated, n.e.c.  30.9  0.017  0.010
2673 Bags: plastics, laminated, and coated  36.6  0.009  0.011
2674 Bags: uncoated paper and multiwall  17.1  0.013  0.025
2675 Die-cut paper and board  15.7  0.011  0.011
2676 Sanitary paper products  38.4  0.020  0.033
2677 Envelopes  27.6  0.007  0.008
2678 Stationery products  11.2  0.021  0.025
2679 Converted paper products, n.e.c.  29.6  0.009  0.012
2711 Newspapers  434.4  0.002  0.002
2721 Periodicals  110.0  0.005  0.067

Industry  Employment  Plant Herfindahl  Gamma
2731 Book publishing  70.1  0.008  0.062
2732 Book printing  43.5  0.012  0.011
2741 Miscellaneous publishing  69.5  0.005  0.008
2752 Commercial printing, lithographic  403.9  0.000  0.004
2754 Commercial printing, gravure  23.8  0.032  0.016
2759 Commercial printing, n.e.c.  125.8  0.001  0.004
2761 Manifold business forms  53.3  0.003  0.003
2771 Greeting cards  21.5  0.091  0.037
2782 Blankbooks and looseleaf binders  39.1  0.007  0.007
2789 Bookbinding and related work  29.7  0.005  0.020
2791 Typesetting  37.6  0.002  0.014
2796 Platemaking services  31.8  0.002  0.010
2812 Alkalies and chlorine  5.0  0.061  0.058
2813 Industrial gases  8.1  0.005  0.011
2816 Inorganic pigments  8.3  0.041  0.031
2819 Industrial inorganic chemicals, n.e.c.  72.2  0.053  0.017
2821 Plastics materials and resins  56.3  0.012  0.029
2822 Synthetic rubber  10.4  0.063  0.165
2823 Cellulosic manmade fibers  10.5  0.224  0.159
2824 Organic fibers, noncellulosic  45.4  0.043  0.140
2833 Medicinals and botanicals  11.6  0.042  0.089
2834 Pharmaceutical preparations  131.6  0.015  0.023
2835 Diagnostic substances  15.4  0.033  0.059
2836 Biological products, except diagnostic  13.3  0.023  0.010
2841 Soap and other detergents  31.7  0.016  0.004
2842 Polishes and sanitation goods  20.6  0.010  0.018
2843 Surface active agents  9.1  0.017  0.040
2844 Toilet preparations  57.9  0.011  0.055
2851 Paints and allied products  55.2  0.003  0.007
2861 Gum and wood chemicals  2.6  0.041  0.061
2865 Cyclic crudes and intermediates  22.8  0.019  0.010
2869 Industrial organic chemicals, n.e.c.  100.3  0.012  0.069
2873 Nitrogenous fertilizers  7.4  0.025  0.031
2874 Phosphatic fertilizers  9.4  0.066  0.291
2875 Fertilizers, mixing only  7.5  0.006  0.020
2879 Agricultural chemicals, n.e.c.  16.1  0.038  0.031
2891 Adhesives and sealants  20.9  0.005  0.012
2892 Explosives  13.8  0.113  0.003
2893 Printing ink  11.1  0.005  0.015
2895 Carbon black  1.8  0.054  0.300
2899 Chemical preparations, n.e.c.  37.9  0.006  0.006
2911 Petroleum refining  74.6  0.011  0.088
2951 Asphalt paving mixtures and blocks  14.6  0.003  0.009
2952 Asphalt felts and coatings  13.5  0.009  0.010
2992 Lubricating oils and greases  11.2  0.007  0.013
2999 Petroleum and coal products, n.e.c.  1.9  0.027  0.061
3011 Tires and inner tubes  65.4  0.025  0.038
3021 Rubber and plastics footwear  10.9  0.060  -0.013
3052 Rubber and plastics hose and belting  23.2  0.026  0.038
3053 Gaskets, packing, and sealing devices  28.4  0.011  0.016
3061 Mechanical rubber goods  49.8  0.008  0.047
3069 Fabricated rubber products, n.e.c.  54.3  0.006  0.022

Industry  Employment  Plant Herfindahl  Gamma
3081 Unsupported plastics film and sheet  48.4  0.006  0.006
3082 Unsupported plastics profile shapes  25.2  0.007  0.005
3083 Laminated plastics plate, sheet, and profile shapes  17.3  0.025  0.005
3084 Plastics pipe  12.5  0.008  0.010
3085 Plastics bottles  25.1  0.007  0.012
3086 Plastics foam products  61.3  0.004  0.004
3087 Custom compounding of purchased plastics resins  17.3  0.008  0.012
3088 Plastics plumbing fixtures  7.5  0.023  0.014
3089 Plastics products, n.e.c.  384.9  0.001  0.005
3111 Leather tanning and finishing  14.6  0.013  0.025
3131 Footwear cut stock  5.0  0.032  0.142
3142 House slippers  3.7  0.104  0.066
3143 Men's footwear, except athletic  31.6  0.016  0.073
3144 Women's footwear, except athletic  26.6  0.012  0.055
3149 Footwear, except rubber, n.e.c.  9.2  0.025  0.088
3151 Leather gloves and mittens  3.1  0.028  0.035
3161 Luggage  11.4  0.027  0.041
3171 Women's handbags and purses  9.5  0.021  0.144
3172 Personal leather goods, n.e.c.  7.2  0.024  0.059
3199 Leather goods, n.e.c.  7.1  0.011  0.023
3211 Flat glass  14.6  0.055  0.019
3221 Glass containers  41.1  0.013  0.011
3229 Pressed and blown glass, n.e.c.  36.3  0.020  0.038
3231 Products of purchased glass  51.1  0.005  0.002
3241 Cement, hydraulic  19.1  0.009  0.010
3251 Brick and structural clay tile  16.6  0.007  0.036
3253 Ceramic wall and floor tile  9.5  0.039  0.023
3255 Clay refractories  6.4  0.027  0.078
3259 Structural clay products, n.e.c.  2.1  0.048  0.160
3261 Vitreous plumbing fixtures  9.7  0.041  0.014
3262 Vitreous china table and kitchenware  5.4  0.126  -0.001
3263 Semivitreous table and kitchenware  1.8  0.109  0.086
3264 Porcelain electrical supplies  10.7  0.030  0.044
3269 Pottery products, n.e.c.  10.5  0.016  0.012
3271 Concrete block and brick  18.6  0.002  0.004
3272 Concrete products, n.e.c.  70.0  0.001  0.012
3273 Ready-mixed concrete  96.8  0.001  0.010
3274 Lime  5.7  0.033  0.063
3275 Gypsum products  12.1  0.013  0.013
3281 Cut stone and stone products  12.5  0.011  0.036
3291 Abrasive products  23.4  0.038  0.028
3292 Asbestos products  4.0  0.107  0.009
3295 Minerals, ground or treated  8.6  0.011  0.005
3296 Mineral wool  21.5  0.020  0.015
3297 Nonclay refractories  7.7  0.020  0.042
3299 Nonmetallic mineral products, n.e.c.  7.6  0.009  0.004
3312 Blast furnaces and steel mills  188.1  0.018  0.067
3313 Electrometallurgical products  3.9  0.072  0.148
3315 Steel wire and related products  24.7  0.012  0.013
3316 Cold finishing of steel shapes  16.4  0.027  0.032
3317 Steel pipe and tubes  19.6  0.010  0.038
3321 Gray and ductile iron foundries  82.4  0.011  0.029

Industry  Employment  Plant Herfindahl  Gamma
3322 Malleable iron foundries  4.2  0.197  0.072
3324 Steel investment foundries  20.3  0.040  -0.003
3325 Steel foundries, n.e.c.  22.9  0.012  0.040
3331 Primary copper  3.3  0.135  0.194
3334 Primary aluminum  17.3  0.050  0.053
3339 Primary nonferrous metals, n.e.c.  11.0  0.044  0.004
3341 Secondary nonferrous metals  12.5  0.008  0.016
3351 Copper rolling and drawing  22.6  0.029  0.018
3353 Aluminum sheet, plate, and foil  26.1  0.063  0.009
3354 Aluminum extruded products  30.7  0.013  0.001
3355 Aluminum rolling and drawing, n.e.c.  0.9  0.084  0.031
3356 Nonferrous rolling and drawing, n.e.c.  17.9  0.031  0.016
3357 Nonferrous wiredrawing and insulating  64.9  0.008  0.018
3363 Aluminum die-castings  28.1  0.010  0.021
3364 Nonferrous die-casting, except aluminum  12.9  0.010  0.036
3365 Aluminum foundries  26.3  0.008  0.021
3366 Copper foundries  8.2  0.007  0.012
3369 Nonferrous foundries, n.e.c.  4.0  0.117  0.103
3398 Metal heat treating  18.0  0.004  0.026
3399 Primary metal products, n.e.c.  13.8  0.105  0.059
3411 Metal cans  39.4  0.006  0.009
3412 Metal barrels, drums, and pails  8.7  0.014  0.042
3421 Cutlery  10.5  0.039  0.056
3423 Hand and edge tools, n.e.c.  41.9  0.008  0.008
3425 Saw blades and handsaws  7.7  0.039  0.039
3429 Hardware, n.e.c.  85.2  0.007  0.008
3431 Metal sanitary ware  8.0  0.064  0.030
3432 Plumbing fixture fittings and trim  17.1  0.023  0.003
3433 Heating equipment, except electric  20.5  0.008  0.000
3441 Fabricated structural metal  80.9  0.006  0.004
3442 Metal doors, sash, and trim  74.7  0.003  0.003
3443 Fabricated plate work (boiler shops)  74.7  0.004  0.010
3444 Sheet metal work  100.2  0.001  0.003
3446 Architectural metal work  28.0  0.004  0.004
3448 Prefabricated metal buildings  25.8  0.009  0.006
3449 Miscellaneous metal work  22.9  0.006  0.014
3451 Screw machine products  42.7  0.002  0.027
3452 Bolts, nuts, rivets, and washers  52.0  0.006  0.029
3462 Iron and steel forgings  26.6  0.017  0.024
3463 Nonferrous forgings  7.3  0.082  0.022
3465 Automotive stampings  119.8  0.013  0.177
3466 Crowns and closures  6.1  0.056  0.039
3469 Metal stampings, n.e.c.  95.5  0.002  0.018
3471 Plating and polishing  71.1  0.001  0.012
3479 Metal coating and allied services  41.5  0.002  0.014
3482 Small arms ammunition  9.0  0.184  -0.004
3483 Ammunition, except for small arms, n.e.c.  41.5  0.041  0.003
3484 Small arms  13.3  0.067  0.080
3489 Ordnance and accessories, n.e.c.  23.9  0.166  0.004
3491 Industrial valves  45.9  0.009  0.006
3492 Fluid power valves and hose fittings  27.9  0.010  0.037
3493 Steel springs, except wire  5.0  0.024  0.048

Industry  Employment  Plant Herfindahl  Gamma
3494 Valves and pipe fittings, n.e.c.  25.1  0.010  0.017
3495 Wire springs  19.7  0.009  0.014
3496 Miscellaneous fabricated wire products  35.1  0.003  0.004
3497 Metal foil and leaf  10.4  0.033  0.033
3498 Fabricated pipe and fittings  20.0  0.004  0.020
3499 Fabricated metal products, n.e.c.  72.5  0.002  0.006
3511 Turbines and turbine generator sets  22.9  0.091  0.023
3519 Internal combustion engines, n.e.c.  64.0  0.034  0.070
3523 Farm machinery and equipment  57.0  0.013  0.064
3524 Lawn and garden equipment  24.9  0.043  0.014
3531 Construction machinery  81.1  0.016  0.061
3532 Mining machinery  13.6  0.016  0.057
3533 Oil and gas field machinery  24.8  0.015  0.433
3534 Elevators and moving stairways  10.2  0.028  -0.001
3535 Conveyors and conveying equipment  31.5  0.005  0.018
3536 Hoists, cranes, and monorails  7.0  0.020  0.015
3537 Industrial trucks and tractors  20.1  0.016  0.004
3541 Machine tools, metal cutting types  31.7  0.019  0.035
3542 Machine tools, metal forming types  13.8  0.018  0.071
3543 Industrial patterns  8.6  0.006  0.051
3544 Special dies, tools, jigs, and fixtures  114.4  0.001  0.053
3545 Machine tool accessories  48.5  0.003  0.037
3546 Power-driven handtools  16.8  0.037  0.045
3547 Rolling mill machinery  3.9  0.067  0.084
3548 Welding apparatus  18.7  0.028  0.040
3549 Metalworking machinery, n.e.c.  11.3  0.011  0.041
3552 Textile machinery  15.6  0.012  0.165
3553 Woodworking machinery  8.9  0.016  0.033
3554 Paper industries machinery  17.1  0.022  0.096
3555 Printing trades machinery  25.0  0.032  0.016
3556 Food products machinery  19.2  0.008  0.014
3559 Special industry machinery, n.e.c.  83.3  0.003  0.007
3561 Pumps and pumping equipment  35.2  0.010  0.008
3562 Ball and roller bearings  36.9  0.021  0.043
3563 Air and gas compressors  23.8  0.021  0.020
3564 Blowers and fans  24.8  0.008  0.003
3565 Packaging machinery  22.6  0.010  0.018
3566 Speed changers, drives, and gears  17.9  0.019  0.019
3567 Industrial furnaces and ovens  16.6  0.010  0.006
3568 Power transmission equipment, n.e.c.  22.0  0.014  0.014
3569 General industrial machinery, n.e.c.  40.6  0.004  0.004
3571 Electronic computers  151.9  0.019  0.059
3572 Computer storage devices  43.3  0.113  0.142
3575 Computer terminals  15.0  0.046  0.005
3577 Computer peripheral equipment, n.e.c.  76.2  0.030  0.031
3578 Calculating and accounting equipment  12.8  0.060  0.008
3579 Office machines, n.e.c.  28.5  0.053  0.015
3581 Automatic vending machines  7.9  0.062  0.005
3582 Commercial laundry equipment  4.6  0.054  0.019
3585 Refrigeration and heating equipment  133.3  0.008  0.011
3586 Measuring and dispensing pumps  9.4  0.083  0.002
3589 Service industry machinery, n.e.c.  35.2  0.005  0.014

Industry
3592 Carburetors, pistons, rings, and valves
3593 Fluid power cylinders and actuators
3594 Fluid power pumps and motors
3596 Scales and balances, except laboratory
3599 Industrial machinery, n.e.c.
3612 Transformers, except electronic
3613 Switchgear and switchboard apparatus
3621 Motors and generators
3624 Carbon and graphite products
3625 Relays and industrial controls
3629 Electrical industrial apparatus, n.e.c.
3631 Household cooking equipment
3632 Household refrigerators and freezers
3633 Household laundry equipment
3634 Electric housewares and fans
3635 Household vacuum cleaners
3639 Household appliances, n.e.c.
3641 Electric lamp bulbs and tubes
3643 Current-carrying wiring devices
3644 Noncurrent-carrying wiring devices
3645 Residential lighting fixtures
3646 Commercial lighting fixtures
3647 Vehicular lighting equipment
3648 Lighting equipment, n.e.c.
3651 Household audio and video equipment
3652 Prerecorded records and tapes
3661 Telephone and telegraph apparatus
3663 Radio and television communications equipment
3669 Communications equipment, n.e.c.
3671 Electron tubes
3672 Printed circuit boards
3674 Semiconductors and related devices
3675 Electronic capacitors
3676 Electronic resistors
3677 Electronic coils and transformers
3678 Electronic connectors
3679 Electronic components, n.e.c.
3691 Storage batteries
3692 Primary batteries, dry and wet
3694 Engine electrical equipment
3695 Magnetic and optical recording media
3699 Electrical equipment and supplies, n.e.c.
3711 Motor vehicles and car bodies
3713 Truck and bus bodies
3714 Motor vehicle parts and accessories
3715 Truck trailers
3716 Motor homes
3721 Aircraft
3724 Aircraft engines and engine parts
3728 Aircraft parts and equipment, n.e.c.
3731 Ship building and repairing
3732 Boat building and repairing


Employment

Plant Herfindahl

Gamma

21.7 

0.038 

0.042 

20.2 

0.052 

0.025 

14.8 

0.034 

0.003 

6.7 

0.027 

0.023 

228.5 

0.000 

0.005 

32.2 

0.016 

0.021 

44.8 

0.010 

0.008 

74.6 

0.008 

0.021 

9.8 

0.033 

0.042 

66.6 

0.010 

0.008 

14.5 

0.017 

0.010 

21.9 

0.050 

0.030 

25.7 

0.107 

0.035 

16.7 

0.128 

0.124 

25.1 

0.019 

0.107 

11.3 

0.182 

-0.009 

16.0 

0.061 

0.030 

22.2 

0.027 

0.033 

47.9 

0.017 

0.009 

21.5 

0.023 

0.012 

22.5 

0.009 

0.027 

22.7 

0.022 

0.018 

15.5 

0.139 

0.022 

14.4 

0.017 

0.010 

30.9 

0.035 

0.016 

13.3 

0.039 

-0.008 

112.3 

0.021 

0.009 

126.0 

0.015 

0.021 

21.9 

0.017 

0.030 

28.4 

0.057 

0.043 

66.6 

0.005 

0.041 

184.6 

0.014 

0.064 

21.7 

0.023 

0.029 

15.7 

0.022 

0.016 

23.9 

0.009 

0.018 

42.8 

0.017 

0.036 

162.6 

0.008 

0.022 

24.2 

0.017 

0.010 

10.7 

0.045 

0.049 

67.3 

0.045 

0.054 

25.6 

0.028 

0.085 

60.3 

0.008 

0.015 

281.3 

0.016 

0.127 

37.8 

0.009 

0.008 

389.6 

0.006 

0.089 

27.5 

0.013 

0.014 

15.1 

0.055 

0.150 

268.2 

0.053 

0.023 

139.6 

0.042 

0.047 

188.2 

0.029 

0.032 

120.2 

0.080 

0.014 

57.2 

0.005 

0.046 

Industry                                              Employment  Plant Herfindahl   Gamma
3743  Railroad equipment                                    22.1             0.085   0.123
3751  Motorcycles, bicycles, and parts                       7.4             0.077   0.010
3761  Guided missiles and space vehicles                   166.7             0.046   0.249
3764  Space propulsion units and parts                      31.8             0.145   0.112
3769  Space vehicle equipment, n.e.c.                       15.1             0.157   0.005
3792  Travel trailers and campers                           17.2             0.011   0.087
3795  Tanks and tank components                             16.7             0.157   0.023
3799  Transportation equipment, n.e.c.                      15.4             0.015   0.021
3812  Search and navigation equipment                      369.4             0.011   0.039
3821  Laboratory apparatus and furniture                    17.1             0.020   0.000
3822  Environmental controls                                26.5             0.035   0.011
3823  Process control instruments                           53.3             0.010   0.017
3824  Fluid meters and counting devices                     10.1             0.032   0.022
3825  Instruments to measure electricity                    85.2             0.014   0.031
3826  Analytical instruments                                31.2             0.014   0.039
3827  Optical instruments and lenses                        20.1             0.027   0.061
3829  Measuring and controlling devices, n.e.c.             41.0             0.015   0.004
3841  Surgical and medical instruments                      73.1             0.007   0.011
3842  Surgical appliances and supplies                      78.5             0.005   0.005
3843  Dental equipment and supplies                         14.6             0.017   0.022
3844  X-ray apparatus and tubes                              8.7             0.049   0.017
3845  Electromedical equipment                              29.2             0.021   0.025
3851  Ophthalmic goods                                      24.2             0.020   0.027
3861  Photographic equipment and supplies                   88.0             0.067   0.174
3873  Watches, clocks, watchcases, and parts                11.8             0.031   0.005
3911  Jewelry, precious metal                               35.5             0.005   0.095
3914  Silverware and plated ware                             6.9             0.065   0.049
3915  Jewelers' materials and lapidary work                  7.1             0.025   0.298
3931  Musical instruments                                   12.2             0.017   0.014
3942  Dolls and stuffed toys                                 4.4             0.027   0.086
3944  Games, toys, and children's vehicles                  30.9             0.017   0.011
3949  Sporting and athletic goods, n.e.c.                   53.6             0.005   0.003
3951  Pens and mechanical pencils                            8.4             0.048   0.030
3952  Lead pencils and art goods                             5.6             0.045   0.030
3953  Marking devices                                        7.5             0.007   0.005
3955  Carbon paper and inked ribbons                         7.3             0.035   0.008
3961  Costume jewelry                                       22.2             0.017   0.320
3965  Fasteners, buttons, needles, and pins                  9.6             0.018   0.042
3991  Brooms and brushes                                    12.3             0.014   0.007
3993  Signs and advertising specialties                     66.3             0.001   0.006
3995  Burial caskets                                         8.7             0.026   0.050
3996  Hard surface floor coverings, n.e.c.                   7.6             0.139   0.097
3999  Manufacturing industries, n.e.c.                      68.3             0.003   0.008